KR101244063B1 - Execution control during program code conversion

Info

Publication number: KR101244063B1
Application number: KR1020077022472A
Authority: KR
Inventors: Gavin Barraclough; Kit Man Wan; Alexander Barraclough Brown; David Nigel Mackintosh
Original assignee: International Business Machines Corporation
Priority date: 2005-03-11
Filing date: 2006-03-06
Publication date: 2013-03-19
Also published as: KR20070110898A; EP1866761A1; WO2006095155A1

Abstract

An execution control method is described for use in a translator 19 that converts subject code 17 into target code 21. The translator 19 includes a translator trampoline function 191 that is called from a translator run loop 190 and in turn calls either a translator code generator 192 to generate target code 21, or previously generated target code 212 for execution. Control then returns to the translator trampoline function 191 to make a new call, or returns to the translator run loop 190. Other aspects include performing context switches through the trampoline function 191 and setting first and second calling conventions on either side of the trampoline function 191. Jumping directly or indirectly between target code blocks 212 during execution is also described.

Code conversion, translator, subject program, target code, target processor, run loop, trampoline function, calling convention

Description

EXECUTION CONTROL DURING PROGRAM CODE CONVERSION

The present invention relates generally to the field of computers and computer software and, more particularly, to program code conversion methods and apparatus useful, for example, in code translators, emulators and accelerators which convert program code.

In both embedded and non-embedded CPUs, there are predominant Instruction Set Architectures (ISAs) for which large bodies of software exist. Provided the relevant software can be accessed transparently, that software could be "accelerated" for performance, or "translated" to a myriad of capable processors that could present better cost/performance benefits. There are also dominant CPU architectures that are locked in time to their ISA and cannot evolve in performance or market reach. Such CPUs would benefit from a software-oriented processor co-architecture.

Program code conversion methods and apparatus to facilitate such acceleration, translation and co-architecture capabilities are disclosed, for example, in published PCT application WO00/22521, and others.

A translator performing program code conversion inevitably brings overheads associated with the operation of the translator. In particular, dynamic translation requires the translator to run inline with the translated program code, so that execution on the host CPU switches between the translator program and the translated code. Performing this switch involves considerable work and causes a time delay.

The present invention improves the performance of a translator when performing program code conversion.

Preferred embodiments of the invention reduce the overhead associated with performing context switches between execution of the translator and execution of the translated code, particularly during dynamic translation.

According to the present invention there is provided an apparatus and method as set forth in the appended claims. Preferred features of the invention will be apparent from the dependent claims and the description which follows.

The following is a summary of various aspects and advantages realizable according to embodiments of the invention. It is provided as an introduction to help those skilled in the art to assimilate more rapidly the detailed design discussion that follows, and it is not intended in any way to limit the scope of the claims appended hereto.

In particular, the inventors have developed an optimization technique directed at expediting program code conversion, which is particularly useful in connection with a run-time translator that translates subject program code into target code.

According to a first aspect of the present invention, a method of execution control when converting subject code into target code includes providing a translator trampoline function which is called from a translator run loop and in turn calls either a translator code generator to generate additional target code, or previously generated target code for execution. Control then returns to the translator trampoline function to make a new call, or returns through the trampoline function to the translator run loop.
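The control flow of this first aspect can be sketched as follows. This is only an illustrative model, not the patented implementation: target code blocks are modelled as Python callables that return the identifier of the next block (or `None` to hand control back to the run loop), and all names (`make_run_loop`, `code_generator`, `trampoline`, `block_a`, and so on) are hypothetical.

```python
from typing import Callable, Dict, Optional

# Assumption: a block takes the subject state and returns the next block id,
# or None when control should go back to the run loop.
Block = Callable[[dict], Optional[str]]

def make_run_loop(subject_blocks: Dict[str, Block]):
    code_cache: Dict[str, Block] = {}            # previously generated target code
    stats = {"generated": 0, "executed": 0}

    def code_generator(block_id: str) -> Block:
        # Stand-in for the translator code generator: "translation" here
        # simply wraps the subject block in a counting callable.
        stats["generated"] += 1
        subject = subject_blocks[block_id]

        def target(state: dict) -> Optional[str]:
            stats["executed"] += 1
            return subject(state)
        return target

    def trampoline(block_id: Optional[str], state: dict) -> None:
        # Called from the run loop. After each call, control comes back here
        # so that a new call can be made without returning to the run loop.
        while block_id is not None:
            if block_id not in code_cache:
                code_cache[block_id] = code_generator(block_id)
            block_id = code_cache[block_id](state)

    def run_loop(entry: str, state: dict) -> dict:
        trampoline(entry, state)
        return stats

    return run_loop

# Hypothetical subject program: A initialises, B loops three times, C finishes.
def block_a(s):
    s["i"] = 0
    return "B"

def block_b(s):
    s["i"] += 1
    return "B" if s["i"] < 3 else "C"

def block_c(s):
    s["done"] = True
    return None

run_loop = make_run_loop({"A": block_a, "B": block_b, "C": block_c})
state = {}
stats = run_loop("A", state)
```

In this model each block is generated once and block B is then re-entered from the trampoline on every iteration, rather than from the run loop.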

Preferably, first and second calling conventions are applied on either side of the trampoline function. Modifying the calling convention allows parameters to be passed, for example between the translator code generator function and the executed target code, in particular using target processor registers. Preferably, the parameters include block identifiers referring to a current block of subject code and/or parameters concerning the subject processor state.

Preferably, the trampoline function also performs a processor mode switch of the target processor, for example between first and second modes. Suitably, the first mode is applied when calling the translator generator function (i.e. in the translator context), and a different second mode is applied for execution of the generated target code.

According to another preferred aspect of the invention, jumping from a first target code block to a subsequent block is permitted either indirectly through the translator trampoline function, or directly from block to block. Preferably, the method includes generating each target code block with tail instructions that provide a linking parameter. The linking parameter links to the subsequent block, for example by linking to the memory address at which the subsequent block is stored, or at which a block object representing the subsequent block is stored. In one embodiment, the linking parameter is used as an operand in a jump instruction.

According to another embodiment of the invention, a preferred method includes providing a second translator trampoline function nested within the first translator trampoline function. Preferably, the second translator trampoline function is called from an executing target code block, which in turn calls a nested translator function to generate additional target code.

According to a further aspect of the invention, a preferred method includes performing a profiling check to determine whether execution control will remain with the target code or return to the translator run loop. The profiling check applies in particular when jumping between target code blocks. Preferably, the profiling check includes maintaining a counter value for execution of a target code block and comparing the counter value against a predetermined threshold. When the target code has been repeated a predetermined number of times, execution control is forced to return to the translator run loop.
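A counter-based profiling check of this kind might look like the sketch below. The class name `ProfilingCheck`, the per-block dictionary of counters and the choice never to reset a counter are all assumptions made for illustration; the text only specifies a counter value compared against a predetermined threshold.

```python
class ProfilingCheck:
    """Lightweight check: count executions of each block against a threshold."""

    def __init__(self, threshold: int):
        self.threshold = threshold
        self.counts = {}                      # block id -> executions so far

    def stay_in_target_code(self, block_id: str) -> bool:
        # Increment the counter for this block, then compare to the threshold.
        # True: keep jumping between target code blocks.
        # False: force control back to the translator run loop.
        self.counts[block_id] = self.counts.get(block_id, 0) + 1
        return self.counts[block_id] < self.threshold

# Hypothetical usage: block "B" is entered four times with a threshold of 3.
check = ProfilingCheck(threshold=3)
decisions = [check.stay_in_target_code("B") for _ in range(4)]
```

With a threshold of 3, the first two entries stay in target code and the third and later entries return control to the run loop.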

The invention also extends to a computer-readable storage medium having recorded thereon program software for carrying out any of the methods described herein. The invention further extends to a computer processor in combination with such software, for example in a translator apparatus or other computing machine.

FIG. 1 is a schematic diagram of an illustrative computing environment in which subject code is translated into target code;

FIG. 2 is a schematic flow diagram illustrating an example method of execution control during program code conversion;

FIG. 3 is a schematic representation of an example subject program;

FIG. 4 is a schematic diagram illustrating execution control in a target architecture;

FIG. 5 shows the workload distribution of the execution control method of FIGS. 2, 3 and 4;

FIG. 6 is a schematic overview of a preferred method of execution control during program code conversion;

FIG. 7 shows the workload distribution in the preferred method;

FIG. 8 illustrates a preferred method that uses first and second calling conventions during program code conversion;

FIG. 9 illustrates another preferred method of execution control during program code conversion;

FIG. 10 shows an example of a stored block object as used in preferred embodiments of the invention;

FIG. 11 illustrates yet another preferred method of execution control during program code conversion;

FIG. 12 illustrates a preferred execution control method including a lightweight profiling check;

FIG. 13 illustrates another preferred execution control method including a lightweight profiling check; and

FIG. 14 illustrates another preferred method of execution control during program code conversion, using nested control loops.

The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate presently preferred implementations and are described as above.

The following description is provided to enable a person skilled in the art to make and use the invention, and sets forth the best modes contemplated by the inventors of carrying out their invention. Various modifications will, however, remain readily apparent to those skilled in the art, since the general principles of the present invention have been defined herein specifically to provide an improved program code conversion method and apparatus.

In the terminology below, a subject program is intended to execute on a subject computing platform that includes a subject processor. A target computing platform that includes a target processor is used to execute the subject program, through a translator which performs dynamic program code conversion. The translator performs code conversion from subject code to target code, such that the target code is executable on the target computing platform.

FIG. 1 illustrates a target computing platform comprising a target processor 13 having a plurality of target registers 15, and a memory 18 for storing a plurality of software components 17, 19, 20, 21 and 27. The software components include an operating system 20, subject code 17, translator code 19, and translated target code 21.

In one embodiment, the translator code 19 is an emulator that translates subject code of a subject ISA into translated target code of another ISA, with or without optimization. In another embodiment, the translator code functions as an accelerator for translating subject code into target code of the same ISA.

The translator 19, i.e. the compiled version of the source code implementing the translator, and the translated code 21, i.e. the translation of the subject code 17 produced by the translator 19, run in conjunction with the operating system 20 running on the target processor 13, which is typically a microprocessor or other suitable computer.

It will be appreciated that the structure illustrated in FIG. 1 is exemplary only and that, for example, software, methods and processes according to the invention may be implemented in code residing within or beneath an operating system. The subject code 17, translator code 19, operating system 20, and storage mechanisms of the memory 18 may be any of a wide variety of types known to those skilled in the art.

Interleaved Code Execution

In the apparatus according to FIG. 1, program code conversion is preferably performed dynamically, at run-time, while the target code 21 is running. The translator 19 runs inline with the translated program 21. The translator 19 is preferably employed as an application compiled for the target architecture. The subject program 17 is translated by the translator 19 at run-time to execute on the target architecture.

Running the subject program 17 through the translator 19 involves two different types of code that execute in an interleaved manner: the translator code 19 and the target code 21. The translator code 19 is generated before run-time, for example by a compiler, from a high-level source code implementation of the translator 19. By contrast, the target code 21 is generated by the translator code 19 throughout run-time, based on the stored subject code 17 of the program being translated.

The subject program 17 is intended to run on a subject processor (not shown). In one embodiment, the translator 19 functions as an emulator. That is, the translator 19 emulates the subject processor while actually executing the subject program 17 as target code 21 on the target processor 13. In a preferred embodiment, at least one global register store 27 is provided (also referred to as the subject register bank 27). In a multiprocessor environment, one or more abstract register banks 27 are optionally provided according to the architecture of the subject processor. A representation of the subject processor state is provided by components of the translator 19 and of the target code 21. That is, the translator 19 stores the subject processor state in a variety of explicit programming language devices such as variables and/or objects, and the compiler used to compile the translator determines how that state and the operations on it are implemented in target code. The target code 21, by comparison, provides the subject processor state implicitly in the target registers 15 and in memory locations 18, which are manipulated by the target instructions of the target code 21. For example, the low-level representation of the global register store 27 is simply a region of allocated memory. In the source code of the translator 19, however, the global register store 27 is a data array or an object which can be accessed and manipulated at a higher level.

FIG. 2 is a schematic flow diagram illustrating an example method of execution control during program code conversion.

As shown in FIG. 2, control initially resides with a translator control loop 190. In step 201, the control loop 190 calls a code generation function 192 of the translator code 19, which translates a block of the subject code 17 into a corresponding block of translated code 21. Then, in step 202, that block of translated code 21 is executed on the target processor 13. Conveniently, the end of each block of translated code 21 contains instructions to return control to the control loop 190. In other words, the steps of translating and executing the subject code are interleaved, such that portions of the subject program 17 are translated and then executed in turn.

The term "basic block" will be familiar to those skilled in the art. A basic block is a section of code with exactly one entry point and exactly one exit point, which limits the block code to a single control path. For this reason, basic blocks are a useful fundamental unit of control flow. Suitably, the translator 19 divides the subject code 17 into a plurality of basic blocks, where each basic block is a sequential set of instructions between a first instruction at a unique entry point and a last instruction at a unique exit point (such as a jump, call or branch instruction). The translator may select just one of these basic blocks (block mode) or select a group of the basic blocks (group block mode). A group block suitably comprises two or more basic blocks which are to be treated together as a single unit. Further, the translator may form iso-blocks, which represent the same basic block of subject code but under different entry conditions.
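As an illustration of dividing code into basic blocks, the following sketch applies the standard "leader" rule: a block starts at the first instruction, at any control-transfer target, and immediately after any control-transfer instruction. The instruction encoding (tuples of opcode and optional target index) and the opcode names are hypothetical and are not taken from any particular subject ISA or from the translator described here.

```python
# Assumption: these opcodes end a basic block.
CONTROL_TRANSFER = {"jmp", "call", "br"}

def split_basic_blocks(instructions):
    """instructions: list of (opcode, target_index_or_None) tuples."""
    if not instructions:
        return []
    leaders = {0}                                  # first instruction is a leader
    for i, (op, target) in enumerate(instructions):
        if op in CONTROL_TRANSFER:
            if target is not None:
                leaders.add(target)                # a transfer target starts a block
            if i + 1 < len(instructions):
                leaders.add(i + 1)                 # so does the following instruction
    starts = sorted(leaders)
    ends = starts[1:] + [len(instructions)]
    return [instructions[s:e] for s, e in zip(starts, ends)]

# Hypothetical program shaped like blocks A, B (a loop) and C.
program = [
    ("mov", None),   # 0: block A
    ("mov", None),   # 1
    ("add", None),   # 2: block B, target of the branch at index 3
    ("br", 2),       # 3: loops back to index 2 or falls through
    ("ret", None),   # 4: block C
]
blocks = split_basic_blocks(program)
```

Each resulting slice has a single entry point (its first instruction) and a single exit point (its last instruction).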

In the preferred embodiments, trees of Intermediate Representation (IR) are generated based on a subject instruction sequence, as part of the process of generating the target code 21 from the original subject program 17. IR trees are abstract representations of the expressions calculated and the operations performed by the subject program. The target code 21 is later generated based on the IR trees. Collections of IR nodes are actually directed acyclic graphs (DAGs), but are referred to colloquially as "trees".

As those skilled in the art will appreciate, in one embodiment the translator 19 is implemented using an object-oriented programming language such as C++. For example, an IR node is implemented as a C++ object, and references to other nodes are implemented as C++ references to the C++ objects corresponding to those other nodes. An IR tree is therefore implemented as a collection of IR node objects containing various references to each other.

Further, in the embodiment under discussion, IR generation uses a set of abstract register definitions which correspond to specific features of the subject architecture upon which the subject program 17 is intended to run. For example, there is a unique abstract register definition for each physical register on the subject architecture (a "subject register"). As such, an abstract register definition in the translator may be implemented as a C++ object which contains a reference to an IR node object (i.e. an IR tree). The aggregate of all IR trees referred to by the set of abstract register definitions is called the working IR forest ("forest" because it contains multiple abstract register roots, each of which refers to an IR tree).
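The relationship between IR nodes, abstract register definitions and the working IR forest can be modelled roughly as below. The text describes C++ objects; this Python sketch is only an analogy, and the class names, the node operations (`load_reg`, `const`, `add`) and the two subject instructions are invented for illustration.

```python
class IRNode:
    """One node of an IR tree; operands may be other IRNode objects."""
    def __init__(self, op, *operands):
        self.op = op
        self.operands = operands

class AbstractRegister:
    """Abstract register definition: holds a reference to the root of an IR tree."""
    def __init__(self, name):
        self.name = name
        self.root = None

# The working IR forest: one abstract register per (hypothetical) subject register.
forest = {name: AbstractRegister(name) for name in ("r1", "r2", "r3")}

# Hypothetical subject instructions:  add r1, r2, 5  ;  mov r3, r1
r2_initial = IRNode("load_reg", "r2")
forest["r1"].root = IRNode("add", r2_initial, IRNode("const", 5))
forest["r3"].root = forest["r1"].root    # shared node: a DAG rather than a strict tree

def render(node):
    """Print an IR tree as a nested expression string."""
    if not isinstance(node, IRNode):
        return str(node)
    return node.op + "(" + ", ".join(render(o) for o in node.operands) + ")"

expression = render(forest["r1"].root)
```

Because `r3` refers to the same node object as `r1`, the structure is a directed acyclic graph, which matches the remark that IR "trees" are really DAGs.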

These IR trees and other processes form part of the translator code generation function 192.

An example of execution control during program code conversion is described below with reference to FIGS. 3 and 4, to aid understanding of the background to the invention and the problems that it addresses.

FIG. 3 is a schematic diagram showing functional sections 300 of a subject program 17. The functional sections are represented as blocks A, B and C, with arrows between them representing the subject program flow. In this example, block B contains a loop structure which involves a decision whether to loop back within block B or to continue to block C.

FIG. 4 is a schematic diagram illustrating execution control in the target architecture. Program flow runs down the diagram, and the columns show execution control passing between execution of the translator 19 and execution of the translated code 21.

The translator 19 suitably includes a run loop 190 to control operation of the translator and execution of the target code. The run loop 190 either calls previously translated target code 21 stored in memory (e.g. TC_A, TC_B or TC_C), or calls the translator code generation function 192 to generate such translated target code from the corresponding block of subject code A, B or C stored in the memory 18. One or more parameters 400 are passed between these functions during execution, in particular denoting the current block under consideration (A, B or C) and information about the subject processor state.

Several problems have been identified with the execution control structure shown in FIG. 4.

First, the execution path is non-linear, with many jumps to arbitrary locations in memory. In particular, execution switches frequently between the run loop 190, the code generator 192 and the target code 21, with jumps that are determined in part by the parameter denoting the current block (A, B or C). These jumps cannot be predicted accurately, which significantly reduces the effectiveness of prediction enhancements in many processor architectures.

Second, each switch between translation and target code execution involves considerable work. As described in more detail below, such a context switch includes, for example, saving registers to or restoring registers from the global register store 27, in order to meet predetermined conditions and reach a settled state. Typically, a context switch requires the target processor 13 to execute ten or more instructions. A basic block A, B or C may itself contain only five to ten instructions, so each context switch adds significantly to the work required of the target architecture.
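To put rough numbers on this, the sketch below computes the share of executed instructions spent on context switching under a deliberately simple assumption: exactly one context switch of ten instructions is paid per execution of a block. The function name and this one-switch-per-block model are illustrative assumptions, not figures from the patent beyond the "ten or more" and "five to ten" ranges quoted above.

```python
def switch_overhead(block_instructions: int, switch_instructions: int = 10) -> float:
    """Fraction of all executed instructions that belong to the context switch."""
    return switch_instructions / (switch_instructions + block_instructions)

# Blocks of five and ten useful instructions, ten-instruction switch.
overhead_small_block = switch_overhead(5)    # 10 / 15
overhead_large_block = switch_overhead(10)   # 10 / 20
```

Under this model, between half and two thirds of the executed instructions would be switching overhead, which is why reducing the number and cost of context switches matters.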

Context Switching

The switching between execution of the translator 19 and execution of the target code 21 is referred to herein as a context switch. Each block of target code 21 executes based on a set of assumptions about the state of the target processor 13 and the memory 18. Similarly, execution of the translator 19 relies on assumptions about the state of the target machine. In particular, a calling convention is defined to determine attributes such as register roles and/or register preservation.

Regarding register roles, certain registers are usually assigned particular roles. Various general purpose registers may also be provided. For example, in the calling convention commonly used for the x86 processor architecture, the register ESP is defined as the stack pointer and the register EBP is defined as the frame/base pointer. These and other roles form part of the calling convention for that ISA and processor, and allow code execution to function effectively.

Register preservation defines which register contents must be preserved (for example, by saving them to the stack and later restoring them) and which registers can safely be ignored (for example, scratch or temporary registers) when moving between the execution of different sections of program code on the target machine.

For example, x86 processors have eight general-purpose registers. In the standard calling convention commonly used on x86 processors, registers EAX, ECX and EDX are scratch registers and need not be preserved across a function call; that is, they are caller-preserved. By contrast, EBX, ESP, EBP, ESI and EDI are callee-preserved: they must be pushed onto the stack on entry to the called block of code and popped from the stack on exit, so that their contents are saved and restored across the call. Callee-preserved registers add overhead to both the storage and the execution of each block of target code 21. That is, the translator 19 adds extra instructions at the beginning and end of each generated block of target code 21 to perform the register save and restore operations required by the calling convention. These come on top of the useful instructions of the target block, which replicate the equivalent block of subject code.

Other processors have 16, 32, 64 or more registers, and therefore involve an increased workload, particularly for callee-preserved registers.

It will be appreciated that each context switch demands significant work from the target architecture and inevitably introduces a delay, slowing execution of the subject program 17 through the translator 19.

FIG. 5 shows the workload distribution of the example execution control method described above with reference to FIGS. 2, 3 and 4. The translator run loop 190 is shown in the left-hand column and execution of the target code 21 in the right-hand column. Work performed by the run loop 190 is denoted X. Within each block (TC_A, TC_B, TC_C) of target code 21, Y denotes work that is overhead of the calling convention, such as register preservation and parameter passing, and Z denotes the useful work of the relevant target code block (corresponding to the work of block A, B or C in the original subject program 17).

Referring to the simple subject program of FIG. 3, block B is executed repeatedly. However, as shown in FIG. 5, each execution of block B requires a context switch between the translator and target execution, and therefore incurs considerable overhead.
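The X, Y and Z breakdown of FIG. 5 can be turned into a rough cost estimate. The Python sketch below is only an illustration: the function names and the instruction counts are assumptions chosen to match the orders of magnitude mentioned in the text (a context switch of ten or more instructions, a basic block of five to ten), not measurements.

```python
def total_cost(executions: int, x: int, y: int, z: int) -> int:
    """Instructions executed when every block run pays run-loop work X,
    calling-convention overhead Y and useful work Z."""
    return executions * (x + y + z)


def overhead_fraction(x: int, y: int, z: int) -> float:
    """Share of the per-execution cost that is not useful work."""
    return (x + y) / (x + y + z)


# Illustrative figures only (assumed, not measured).
X_RUN_LOOP = 10      # run-loop work per dispatch
Y_CONVENTION = 10    # register save/restore and parameter passing
Z_USEFUL = 8         # a basic block of five to ten instructions

total = total_cost(100, X_RUN_LOOP, Y_CONVENTION, Z_USEFUL)
fraction = overhead_fraction(X_RUN_LOOP, Y_CONVENTION, Z_USEFUL)
print(total, f"{fraction:.0%}")
```

With these assumed figures, 100 executions of block B cost 2800 instructions, of which roughly 71% is overhead rather than useful work.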

Preferred embodiments of the present invention provide an improved method of execution control during program code conversion, in particular by performing context switches more efficiently, by reducing the overhead associated with each context switch operation, and by reducing how often context switches occur.

Assembly Trampoline

FIG. 6 is a schematic overview of a method of controlling execution during program code conversion according to a first preferred aspect of the present invention.

As shown in FIG. 6, the translator run loop 190 calls the assembly trampoline function 191, which in turn calls a stored, previously translated target code block 212 (TC_A, TC_B, TC_C) for execution. Alternatively, the trampoline function 191 calls the translator code generator function 192 to generate target code 21 from the stored subject program 17.

FIG. 7 shows the workload distribution during program code conversion according to the first preferred aspect of the present invention.

As shown in FIG. 7, the workload is distributed between the translator 19 and the target code 21. Work performed by the run loop 190 is denoted X, work performed by the trampoline function 191 is denoted Y', and work performed by the target code blocks 212 is denoted Z.

An example of pseudo-code for the trampoline function 191 is as follows:

load block_identifier

call block_identifier

The main task of the trampoline function 191 is to perform the context switch between the translator context and the target code context. In this example, the variable block_identifier is loaded into a register, and the register value is then used as an address for calling and executing, directly or indirectly, the code stored at that address. The code executed is either previously translated target code 212 or the translator code generator function 192.

Providing the trampoline function 191 yields an immediate benefit by reducing the size of each target code block 212 stored in memory: each block contains its useful instructions and only minimal overhead. Each block is also smaller and quicker to generate, which reduces the work done at translation time.
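The dispatch described above (run loop 190 calls trampoline 191, which calls either a stored target block 212 or the code generator 192) can be modelled in a few lines of Python. This is a minimal sketch under stated assumptions: the `SUBJECT` successor table, the `translated` cache and the convention that each call returns the label of the next block are hypothetical stand-ins, and Python callables replace real machine addresses.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical subject program: each block names its successor.
SUBJECT: Dict[str, Optional[str]] = {"A": "B", "B": "C", "C": None}

translated: Dict[str, Callable[[], Optional[str]]] = {}  # stored target blocks 212
log: List[str] = []


def code_generator(label: str) -> str:
    """Stand-in for code generator 192: build and store a target block."""
    def target_block() -> Optional[str]:
        log.append(f"run TC_{label}")
        return SUBJECT[label]          # label of the next subject block
    translated[label] = target_block
    log.append(f"translate {label}")
    return label                       # same block is dispatched again


def trampoline(block_identifier: str) -> Optional[str]:
    """Stand-in for trampoline 191: 'load block_identifier' then 'call'."""
    target = translated.get(block_identifier)
    if target is None:
        return code_generator(block_identifier)
    return target()


def run_loop(entry: str) -> None:
    """Stand-in for run loop 190: one trampoline call per block."""
    current: Optional[str] = entry
    while current is not None:
        current = trampoline(current)


run_loop("A")
print(log)
```

In this basic form control still returns to the run loop after every block; only the save and restore work has moved out of the blocks and into the trampoline.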

Calling Conventions

FIG. 8 shows a preferred method of using first and second calling conventions during program code conversion.

The method comprises applying a first calling convention 71 during execution of the translator 19, and applying a second calling convention 72 during execution of the target code 21.

As shown in FIG. 8, under the first calling convention 71 a register is assigned a first predetermined role and assumes predetermined attributes. Under the second calling convention 72 the register is assigned a second, different role and assumes a second, different set of attributes.

The trampoline function 191 includes instructions that perform the calling convention switch from the first calling convention to the second. An example of pseudo-code for a trampoline with calling convention switch instructions is as follows:

push ebx

push esi

push edi    // save callee-preserved registers

load block_identifier

call block_identifier    // basic trampoline

pop edi

pop esi

pop ebx    // restore saved registers

Conveniently, the first calling convention 71 conforms to the instruction set architecture (ISA) of the target processor 13. This first calling convention defines one or more registers 15 as callee-preserved and one or more registers as caller-preserved (scratch registers).

The translator 19 itself operates according to the first calling convention, as appropriate to the instruction set architecture of the target processor 13. The target code 21, however, is generated by the translator 19 according to the second calling convention, and so can take advantage of the alternate register roles and register preservation rules that the second calling convention defines.
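The effect of performing the save and restore once in the trampoline, rather than in every block, can be illustrated with a small register-file simulation. In this Python sketch the dictionary `regs`, the list `stack` and the sample block are assumptions made for illustration; only the push and pop order follows the pseudo-code listing for the calling convention switch.

```python
from typing import Callable, Dict, List

CALLEE_PRESERVED = ("ebx", "esi", "edi")   # per the first calling convention


def trampoline(regs: Dict[str, int], stack: List[int],
               target_block: Callable[[Dict[str, int]], None]) -> None:
    for name in CALLEE_PRESERVED:            # push ebx / push esi / push edi
        stack.append(regs[name])
    target_block(regs)                       # load + call block_identifier
    for name in reversed(CALLEE_PRESERVED):  # pop edi / pop esi / pop ebx
        regs[name] = stack.pop()


def sample_target_block(regs: Dict[str, int]) -> None:
    """Under the second convention the block clobbers these registers freely
    and carries no prologue or epilogue of its own."""
    regs["ebx"], regs["esi"], regs["edi"] = 111, 222, 333
    regs["eax"] = 42                         # scratch register, not restored


regs = {"eax": 0, "ebx": 1, "esi": 2, "edi": 3}
stack: List[int] = []
trampoline(regs, stack, sample_target_block)
print(regs, stack)
```

After the call the translator sees its own values of ebx, esi and edi again, while the block's result in the scratch register eax survives.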

Parameter Passing

Referring again to FIG. 7, the trampoline function 191 also provides a convenient mechanism for passing parameters into and out of each target code block (TC_A, TC_B, TC_C) when it executes. In particular, it is useful to pass at least a first parameter that identifies the current block of subject code under consideration (for example A, B or C). It is also helpful to pass one or more second parameters that provide information about the subject processor state, such as a pointer to the abstract register bank 27.

Most conveniently, parameters are passed into and/or out of each target code block in registers 15 of the target processor 13. This is achieved by selectively adhering to the first and second calling conventions 71, 72 at the appropriate times, so that parameter values held in one or more of these registers 15 are passed across.

An example of pseudo-code for a trampoline with calling convention switch instructions and parameter passing is as follows:

push ebx

push esi

push edi

load block_identifier, eax

push ebp

call eax

pop ebp

pop edi

pop esi

pop ebx

In this particular example, under the first calling convention 71 for an x86 processor, register ebp holds the stack base pointer. Under the preferred second calling convention 72, ebp instead holds a pointer to the abstract register bank 27, contrary to the assumptions made about that register under the first calling convention 71.

Passing a parameter in one of the target registers 15 has several advantages. First, a save/restore operation is avoided. Second, it is generally much faster for the processor to read a value from a register than to access memory, whether the stack, cache or long-term memory.
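The change of role for ebp can be sketched in the same simulated style. The listing above does not show where ebp is pointed at the abstract register bank, so the assignment marked "assumed step" below is an assumption, as are the dictionary-based register file and the subject register names `r1` and `r2`.

```python
from typing import Any, Callable, Dict, List


def trampoline(regs: Dict[str, Any], stack: List[Any],
               block: Callable[[Dict[str, Any]], None],
               register_bank: Dict[str, int]) -> None:
    regs["eax"] = block            # load block_identifier, eax
    stack.append(regs["ebp"])      # push ebp (frame pointer, first convention)
    regs["ebp"] = register_bank    # assumed step: ebp now refers to bank 27
    regs["eax"](regs)              # call eax
    regs["ebp"] = stack.pop()      # pop ebp


def tc_b(regs: Dict[str, Any]) -> None:
    """Target block: reaches subject state through ebp, with no memory
    lookup needed to find the register bank."""
    bank = regs["ebp"]
    bank["r1"] = bank["r1"] + bank["r2"]


regs: Dict[str, Any] = {"eax": None, "ebp": 0xBFFF0000}
bank = {"r1": 5, "r2": 7}
stack: List[Any] = []
trampoline(regs, stack, tc_b, bank)
print(bank["r1"], hex(regs["ebp"]))
```

The block receives its parameter in a register and updates the subject state, and the translator's own ebp value is back in place once the trampoline returns.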

Processor Mode Switching

Optionally, a processor mode switch is performed during each context switch. That is, the trampoline function 191 further includes instructions for performing a processor mode switch.

As one example, some processors can operate in either a little-endian (least significant byte first) or a big-endian (most significant byte first) byte-ordering mode. The processor mode switch is performed during the context switch operation to set whichever byte-ordering mode suits target code execution, with the opposite mode set for translator execution.

There are practical advantages to performing a processor mode switch, particularly where the target processor offers an alternate mode of operation that better reflects the needs of a subject program written for a particular subject processor mode. As will be familiar to those skilled in the art, changing the processor mode is achieved with a set of instructions that set a mode flag or other processor control arrangement, as appropriate to the particular target processor.

Trampoline Block Jumping

FIG. 9 shows another preferred embodiment of the method of execution control during program code conversion.

Referring to FIG. 9, each block of target code 212 ends with an instruction or instructions that pass control back to the trampoline function 191. Here, a further enhancement allows the current block of target code 212 (for example TC_B) to refer indirectly, through the trampoline function 191, to a subsequent block of target code (for example TC_B or TC_C).

Referring again to the simple looped structure of the program in FIG. 3, assume for illustration that block B ends with a jump that either returns to the beginning of block B or continues to block C.

As shown in FIG. 9, trampoline block jumping allows program execution to jump from the current block (TC_B) to the next block (TC_B or TC_C) through the assembly code trampoline 191 without returning to the run loop 190. This avoids a context switch by remaining in the target code execution context, which is a significant saving. In the preferred embodiment, each block of target code 212 is generated with a tail that provides a linking parameter (such as the variable "block_identifier") linked to a stored block object, which contains either the code of the subsequent target code block or a link to the stored subsequent target code block. The trampoline function 191 receives this block linking parameter and uses it to call the subsequent block of target code.

A preferred form of the stored block object is shown in FIG. 10. Each block object 100 comprises a block label 101 (for example "A", "B" or "C"), the subject address 102 of the corresponding block of subject code 17, and a target address 103. The target address 103 refers to the target code block (TC_A, TC_B, TC_C) if one is available, or otherwise to the target code generator 192. A jump to the stored target address 103 therefore executes either the target code or the code generator function, as appropriate.
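The block object of FIG. 10 and the block-jumping trampoline can be sketched together in Python. The field names mirror the reference numerals in the text, but everything else is an assumption for illustration: Python callables stand in for target addresses, the subject addresses are invented, and block B is made to loop three times before continuing to C.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class BlockObject:
    label: str                 # block label 101
    subject_address: int       # subject address 102 (values below are made up)
    target_address: Callable[["BlockObject"], Optional["BlockObject"]]  # 103


trace: List[str] = []
counters = {"run_loop_entries": 0, "b_iterations": 0}
blocks: Dict[str, BlockObject] = {}


def tc_a(block: BlockObject) -> Optional[BlockObject]:
    trace.append("TC_A")
    return blocks["B"]                       # tail: link to block object B


def tc_b(block: BlockObject) -> Optional[BlockObject]:
    trace.append("TC_B")
    counters["b_iterations"] += 1
    return blocks["B"] if counters["b_iterations"] < 3 else blocks["C"]


def tc_c(block: BlockObject) -> Optional[BlockObject]:
    trace.append("TC_C")
    return None                              # end of the subject program


TARGET_CODE = {"A": tc_a, "B": tc_b, "C": tc_c}


def code_generator(block: BlockObject) -> Optional[BlockObject]:
    """Stand-in for generator 192: translate, then patch target address 103."""
    trace.append(f"translate {block.label}")
    block.target_address = TARGET_CODE[block.label]
    return block                             # dispatch the same block again


def trampoline(block: Optional[BlockObject]) -> None:
    """Follow block links without going back to the run loop."""
    while block is not None:
        block = block.target_address(block)


def run_loop(entry: BlockObject) -> None:
    counters["run_loop_entries"] += 1
    trampoline(entry)


for label, address in (("A", 0x1000), ("B", 0x1010), ("C", 0x1020)):
    blocks[label] = BlockObject(label, address, code_generator)

run_loop(blocks["A"])
print(trace, counters["run_loop_entries"])
```

Because every untranslated block object initially points at the code generator, the trampoline needs no separate "is it translated yet?" test, and the run loop is entered only once for the whole program.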

Target Code Jumping

FIG. 11 shows another preferred embodiment of the method of execution control during program code conversion.

Referring to FIG. 11, each block of target code 212 ends with a tail instruction or instructions that link from the current block (TC_B) to the subsequent block (TC_B or TC_C) without returning to the trampoline function 191.

Preferably, the tail includes a jump instruction that performs an indirect jump based on a value loaded from memory. An example of pseudo-code for the target code tail is as follows:

jmp *offset(block_identifier)

The jump is performed according to the value of a stored linking parameter, here called "block_identifier". The stored variable points to the memory address holding the block object to be executed next. The block object either stores the translated target code or contains an address pointer to the target code.

In a particularly preferred embodiment, the linking parameter is provided in one of the target registers 15. An example tail instruction is then:

jmp *offset(eax)

In this example, the variable block_identifier is held in register EAX of an x86 processor, and the tail instruction is modified to refer to register EAX, which contains the linking parameter. Jumping to the memory address stored in register EAX links to the stored object representing the block of code to be executed next. In addition, the value in register EAX is retained to identify the current block of the subject program under consideration. That is, the linking parameter preferably identifies the block (A, B or C) of the subject program, which is useful during execution of the corresponding target code.

Lightweight Profiling Check

FIG. 12 shows a further preferred aspect of the execution control method, which provides a lightweight profiling check.

In a first embodiment, similar to that of FIG. 9 described above, the decision whether to remain in the target code execution context (the second calling convention and corresponding processor mode) or to return to the run loop 190 and restore the translator context (the first calling convention and first processor mode) is made within the trampoline function 191.

This decision matters in practice because the translator can make many optimisations during execution of the subject code 17 that may require translated code 21 to be modified or replaced. Returning control to the run loop 190 allows these optimisation decisions to take place.

A particularly preferred mechanism assigns an execution limit to each target code block 212. That is, each target code block is executed only N times, where N is an integer, before control returns to the run loop. Conveniently, a counter counts up to a threshold N that the translator 19 sets when it generates the target code block 212. A further saving is made by instead counting down from the threshold N to zero, which permits a "jump less than or equal to zero" type instruction that is more efficient to execute on many processor architectures.

Referring again to FIG. 10, in the preferred embodiment each block object 100 of the target code is generated to include a profile count 104, which stores the current value of the counter.

The lightweight profiling check is provided in the trampoline function 191, as illustrated by the following pseudo-code example:

push ebx

push esi

push edi

load block_identifier, eax

push ebp

1:

dec offset(eax)

jz 2

call eax

jmp 1

2:

pop ebp

pop edi

pop esi

pop ebx

In this example, the jz (jump if zero) instruction bypasses the call once the counter has reached zero.
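The countdown loop between labels 1 and 2 can be modelled as follows. This Python sketch makes several assumptions for illustration: the block is taken to link back to itself on every iteration, the call is reduced to incrementing an `executions` field, and the exit test uses "less than or equal to zero" (the variant mentioned in the text) instead of the listing's jz so that a count that starts at zero cannot loop forever.

```python
from dataclasses import dataclass


@dataclass
class BlockObject:
    label: str
    profile_count: int       # profile count 104, set by the translator
    executions: int = 0      # bookkeeping for this sketch only


def trampoline(block: BlockObject) -> BlockObject:
    """Run the block until its profile count is exhausted, then return so
    the run loop can reconsider the block (for example, to re-optimise it)."""
    while True:                        # label 1:
        block.profile_count -= 1       # dec offset(eax)
        if block.profile_count <= 0:   # jz 2 (listing); <= 0 is the safer form
            return block               # label 2: fall through to the pops
        block.executions += 1          # call eax
        # jmp 1


hot_block = BlockObject("B", profile_count=4)
trampoline(hot_block)
print(hot_block.executions, hot_block.profile_count)
```

Note that with the decrement placed before the call, as in the listing, an initial count of 4 yields three executions before control goes back to the run loop.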

FIG. 13 shows another preferred aspect of the lightweight profiling check, suited to the embodiment described above with reference to FIG. 11.

Here, the profiling check is performed within the target code block 212. A threshold is set and is decremented towards zero on each subsequent iteration. Preferably, the check is performed on entry to the target code block, with instructions such as in the following example:

dec offset(block_identifier)

jnz 1

ret

1:

When the threshold is reached (for example, at zero), control returns to the trampoline function 191, which takes appropriate action, such as performing a context switch back to the translator run loop 190.

Nested Control Loops

FIG. 14 shows another preferred method of execution control during program code conversion, using nested control loops.

As shown in FIG. 14, a nested form of execution control provides a convenient mechanism for calling the translator code generator function 192. Here, block jumping is performed as described for FIGS. 9 and 11. Thus the tail of target code block TC_B attempts to jump to block TC_C, which has not yet been generated. Instead, block object C contains a redirection, so that execution passes to a second trampoline function 193, which calls the translator code generator function 192. The target code block TC_C is then generated from the corresponding subject code block C. Control returns through the second trampoline function 193, then through the first trampoline function 191, and so back to the run loop 190. The target code block TC_C is then called through the first trampoline function 191.

The first and second trampoline functions 191, 193 are very similar and may share the same code, but are shown separately in FIG. 14 for clarity.

FIG. 14 also shows the context switching between the translator context and the target code context. Each of the trampoline functions 191, 193 performs a context switch into and out of the target code context, including changing between the first and second calling conventions 71, 72. The trampolines 191, 193 are therefore each drawn straddling the first and second contexts.

The second trampoline function 193 nests a context switch into and out of the translator context 71 in order to execute the code generator 192. The first trampoline function then unpicks the original context switch to return to the upper level of the run loop 190. The double bounce through the first and second trampolines 191, 193 ensures that parameters and other data are saved on the stack and then restored in the correct order. As described above, each context switch requires register save or restore operations, which conveniently use pushes and pops on a LIFO stack.
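The "double bounce" can be traced with a small Python model in which a LIFO list records which context is active. This is a sketch under assumptions: contexts are just strings on a stack, the redirection held in block object C is reduced to a membership test on a `translated` dictionary, and only blocks B and C are modelled.

```python
from typing import Callable, Dict, List, Optional

events: List[str] = []
contexts: List[str] = []                       # LIFO stack of nested contexts
translated: Dict[str, Callable[[], Optional[str]]] = {}


def switch_in(context: str) -> None:
    contexts.append(context)                   # save state, enter new context
    events.append(f"-> {context}")


def switch_out() -> None:
    events.append(f"<- {contexts.pop()}")      # restore in reverse order


def code_generator(label: str) -> None:
    """Stand-in for code generator 192."""
    events.append(f"generate TC_{label}")

    def generated() -> Optional[str]:
        events.append(f"run TC_{label}")
        return None                            # end of this small program
    translated[label] = generated


def second_trampoline(label: str) -> None:
    """Stand-in for trampoline 193: nested switch into the translator."""
    switch_in("translator")
    code_generator(label)
    switch_out()


def tc_b() -> Optional[str]:
    events.append("run TC_B")
    if "C" not in translated:                  # redirection in block object C
        second_trampoline("C")
    return "C"                                 # unwind; run loop dispatches C


def first_trampoline(label: str) -> Optional[str]:
    """Stand-in for trampoline 191."""
    switch_in("target")
    next_label = translated[label]()
    switch_out()
    return next_label


def run_loop(entry: str) -> None:
    label: Optional[str] = entry
    while label is not None:
        label = first_trampoline(label)


translated["B"] = tc_b
run_loop("B")
print(events)
```

The trace shows the translator context being entered and left inside the target context, and the context stack is empty again once the run loop finishes.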

Often, a subject program 17 requires small regions of code to be executed repeatedly, while major parts of the program are executed rarely, if at all. For example, the subject program may be a spreadsheet or word processor in which only a relatively small number of the many available functions or commands are used frequently. In a dynamic binary translator, therefore, target code execution occurs more often than translation. The preferred execution control method of FIG. 14 adds overhead in the second trampoline function 193, but is optimised for execution of the translated code 21.

Each of the aspects described above may be used in isolation. However, significant gains are achieved by a synergistic combination of these mechanisms. That is, by combining any one or more of context switching through a trampoline function, first and second calling conventions, block jumping, lightweight profiling checks and nested control loops, significant savings are made when executing subject code 17 through the translator 19.

As a practical example, under preferred implementations of the invention, execution is two to three times faster with the enhancements described herein. This is a significant improvement in the performance of the translator 19.

Although a number of preferred embodiments have been shown and described, those skilled in the art will appreciate that various changes and modifications may be made without departing from the scope of the invention as defined in the appended claims.

Attention is directed to all papers and documents which are filed concurrently with or previous to this specification in connection with this application and which are open to public inspection with this specification, and the contents of all such papers and documents are incorporated herein by reference.

All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and/or all of the steps of any method or process so disclosed, may be combined in any combination, except combinations where at least some of such features and/or steps are mutually exclusive.

Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise. Thus, unless expressly stated otherwise, each feature disclosed is one example only of a generic series of equivalent or similar features.

The invention is not restricted to the details of the foregoing embodiment(s). The invention extends to any novel one, or any novel combination, of the features disclosed in this specification (including any accompanying claims, abstract and drawings), or to any novel one, or any novel combination, of the steps of any method or process so disclosed.

Claims

A method of execution control during program code conversion of a subject program into target code executable by a target processor, the method comprising:

(a) providing a translator run loop that sets a subject block in the subject program as a current subject block;

(b) calling a translator trampoline function from the translator run loop;

(c) calling from the translator trampoline function, without returning to the translator run loop, either a translator code generator function to generate and store a target code block from the current subject block, or else a previously stored target code block corresponding to the current subject block for execution; and

(d) 상기 번역기 실행 루프로 돌아가지 않은 상기 실행으로부터(from the executing without returning to the translator run loop), 상기 번역기 트램폴린 함수로 돌아가서 다른 주체 블럭에 대한 단계 (c)를 반복하거나, 그렇지 않으면 상기 번역기 트램폴린 함수로부터 상기 번역기 실행 루프로 돌아가는 단계;를 특징으로 하는 실행 제어 방법.(d) from the executing without returning to the translator run loop, returning to the translator trampoline function and repeating step (c) for another subject block, or else the translator Returning from the trampoline function to the translator execution loop.
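Steps (a) to (d) can be pictured with the following minimal sketch. It is an illustration only, not the claimed implementation: the names `SUBJECT_PROGRAM`, `code_cache`, `code_generator`, `trampoline`, and `run_loop` are hypothetical, subject blocks are modeled as dictionary entries, and target code blocks are modeled as Python closures that return the identifier of the next subject block.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical subject program: block id -> (value to emit, successor id or None).
SUBJECT_PROGRAM = {0: ("a", 1), 1: ("b", 2), 2: ("c", None)}

TargetBlock = Callable[[List[str]], Optional[int]]
code_cache: Dict[int, TargetBlock] = {}   # previously stored target code blocks
generated: List[int] = []                 # records which blocks were translated


def code_generator(block_id: int) -> None:
    """Generate and store a target code block for one subject block."""
    value, successor = SUBJECT_PROGRAM[block_id]

    def target_block(output: List[str]) -> Optional[int]:
        output.append(value)
        return successor      # tells the trampoline which block comes next

    code_cache[block_id] = target_block
    generated.append(block_id)


def trampoline(block_id: int, output: List[str]) -> None:
    """Steps (c) and (d): dispatch repeatedly without returning to the run loop."""
    current: Optional[int] = block_id
    while current is not None:
        block = code_cache.get(current)
        if block is None:
            code_generator(current)   # translate, then come back to the trampoline
            continue
        current = block(output)       # execute the stored target code block


def run_loop(entry: int) -> List[str]:
    """Steps (a) and (b): set the current subject block and call the trampoline."""
    output: List[str] = []
    trampoline(entry, output)
    return output
```

Calling `run_loop(0)` twice translates each block only on the first pass; the second pass finds every block in `code_cache` and executes it directly.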

The method of claim 1, further comprising:

applying a first calling convention during translator execution, including execution of the translator run loop and the translator code generator function; and

applying a second calling convention during execution of the target code.

The method of claim 2, further comprising:

in the first calling convention, assigning a first predetermined role and a first register preservation attribute to a register of the target processor; and

in the second calling convention, assigning a second predetermined role and a second register preservation attribute to the register of the target processor.

The method of claim 2, further comprising:

performing a switch from the first calling convention to the second calling convention, or vice versa, in the trampoline function when switching between the translator execution and the target code execution.

The method of claim 4,

wherein the first calling convention is adapted to suit the instruction set architecture of the target processor.

The method of claim 4,

wherein the first calling convention defines a first set of register roles and preservation attributes, including callee-preserved and caller-preserved registers, and the second calling convention defines a second set of register roles and preservation attributes different from the first set.
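The switch between two calling conventions inside the trampoline can be sketched as follows. This is a simplified model under stated assumptions: the register names (`r1`, `r3`, `r31`), their roles, and the helper `trampoline_call` are hypothetical, and the target processor's register file is modeled as a dictionary. The trampoline saves the registers that the first convention treats as callee-preserved, lets the target block use the registers under the second convention, and then restores the saved values.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


@dataclass
class CallingConvention:
    roles: Dict[str, str]              # register name -> role
    callee_preserved: FrozenSet[str]   # registers a callee must leave intact


# Hypothetical first convention (translator side).
FIRST = CallingConvention(
    roles={"r1": "stack pointer", "r3": "argument / return value", "r31": "local variable"},
    callee_preserved=frozenset({"r1", "r31"}),
)

# Hypothetical second convention (target code side): different roles, nothing preserved.
SECOND = CallingConvention(
    roles={"r1": "scratch", "r3": "block parameter", "r31": "subject state pointer"},
    callee_preserved=frozenset(),
)


def trampoline_call(registers: Dict[str, int],
                    target_block: Callable[[Dict[str, int]], None]) -> None:
    """Switch from FIRST to SECOND, run the block, then switch back."""
    saved = {name: registers[name] for name in FIRST.callee_preserved}
    target_block(registers)        # target code may overwrite any register
    registers.update(saved)        # restore what FIRST promises to preserve
```

In this model the value left in `r3` survives the switch back, which is how a target block could hand a result to the translator side.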

The method of claim 2, further comprising:

applying the first calling convention, appropriate to the instruction set architecture of the target processor, when executing the translator code generator function; and

generating the target code in the translator code generator function so as to operate according to the second calling convention.

The method of claim 1, further comprising:

passing one or more parameters from the translator code generator function to an executed block of target code, or from the executed block of target code to the translator code generator function.

The method of claim 8,

wherein the one or more parameters include at least a first parameter denoting the current block under consideration, and one or more second parameters providing information about a subject processor state.

The method of claim 1, further comprising:

passing a parameter into or out of each target code block using a register of the target processor.

The method of claim 2,

wherein the second calling convention assigns a parameter-passing role to at least one register of the target processor.
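Parameter passing through registers can be sketched as below. The sketch assumes a hypothetical register assignment (`BLOCK_REG`, `STATE_REG`) and a hypothetical `SubjectState` structure; none of these names come from the specification. A target block reads the subject processor state through one register and writes the identity of the next block into another.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SubjectState:
    """Hypothetical model of subject processor state."""
    regs: List[int] = field(default_factory=lambda: [0, 0, 0, 0])
    pc: int = 0


# Hypothetical parameter-passing roles under the second calling convention.
BLOCK_REG = "r3"   # first parameter: the current block under consideration
STATE_REG = "r4"   # second parameter: reference to the subject processor state


def target_block_a(target_regs: Dict[str, object]) -> None:
    """A target block that updates subject state and names its successor."""
    state = target_regs[STATE_REG]
    assert isinstance(state, SubjectState)
    state.regs[0] += 5                   # effect of the translated subject code
    state.pc += 4
    target_regs[BLOCK_REG] = "block_B"   # parameter passed out of the block
```

Keeping both parameters in registers means no memory traffic is needed at the boundary between the trampoline and the target block in this model.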

The method of claim 1, further comprising:

performing a processor mode switch in the translator trampoline function to switch the mode of the target processor.

The method of claim 12, further comprising:

setting the target processor to a first mode at least during execution of the translator code generator function, and setting the target processor to a second mode at least during execution of the target code block.

The method of claim 1, further comprising:

linking indirectly from a current block of target code to a subsequent block of target code through the translator trampoline function.

The method of claim 14, further comprising:

generating each block of target code with a tail that provides a linking parameter linked to a stored block object, the block object either containing the code of the target code block or containing a link to the stored target code block; and

in the translator trampoline function, calling the subsequent block of target code upon receiving the block linking parameter from the current block.
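Indirect linking through the trampoline can be sketched as follows. The `BlockObject` class and the block names are hypothetical, and the linking parameter is modeled as a Python reference to the next block object, returned by the tail of each block.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class BlockObject:
    """Hypothetical stored block object holding (a link to) target code."""
    name: str
    code: Optional[Callable[[List[str]], Optional["BlockObject"]]] = None


def trampoline(first: BlockObject, trace: List[str]) -> None:
    """Call each block, receive its linking parameter, call the next block."""
    current: Optional[BlockObject] = first
    while current is not None and current.code is not None:
        current = current.code(trace)   # the tail returns the linking parameter
    # A block object without code would need the code generator; omitted here.


block_a = BlockObject("A")
block_b = BlockObject("B")


def code_a(trace: List[str]) -> Optional[BlockObject]:
    trace.append("A")
    return block_b        # tail: linking parameter for the subsequent block


def code_b(trace: List[str]) -> Optional[BlockObject]:
    trace.append("B")
    return None           # no successor: the trampoline returns to its caller


block_a.code = code_a
block_b.code = code_b
```

Because every transfer passes back through `trampoline`, that function is a natural place for bookkeeping such as the profiling check of a later claim.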

The method of claim 1, further comprising:

forming each block of target code with a tail that links indirectly from the current target code block to a subsequent block of target code without returning to the translator trampoline function.

The method of claim 16,

wherein the tail comprises a jump instruction that performs an indirect jump based on a linking parameter stored in memory.

The method of claim 17,

wherein the linking parameter points to a memory address storing a subsequent block object to be executed next, the block object storing the translated target code of the subsequent block.

The method of claim 17,

wherein the linking parameter points to a memory address storing a subsequent block object to be executed next, the block object containing an address pointer to the subsequent target code block.

The method of claim 16, further comprising:

storing, in the current target code block, a linking parameter in a target register of the target processor; and

linking to a stored object representing the block of code to be executed next by jumping to the memory address contained in the target register.
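The variant in which blocks jump to one another without returning to the trampoline can be sketched as below. This is only a model: `memory`, `link_slots`, the addresses, and `make_block` are hypothetical, and since Python has no jump instruction, the indirect jump is stood in for by a call through a table lookup.

```python
from typing import Callable, Dict, List

Block = Callable[[List[str]], None]

memory: Dict[int, Block] = {}     # hypothetical: address -> stored block code
link_slots: Dict[str, int] = {}   # linking parameters held in memory, per block


def make_block(name: str) -> Block:
    def block(trace: List[str]) -> None:
        trace.append(name)
        # Tail: load the linking parameter (as if into a target register)...
        target_address = link_slots.get(name)
        if target_address is not None:
            # ...and jump indirectly to the address it contains.
            memory[target_address](trace)
    return block


memory[0x1000] = make_block("A")
memory[0x2000] = make_block("B")
memory[0x3000] = make_block("C")
link_slots["A"] = 0x2000   # A links to B
link_slots["B"] = 0x3000   # B links to C; C has no successor
```

Because the successor is read from memory at run time, rewriting a single `link_slots` entry redirects a block without regenerating its code.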

The method of claim 1, further comprising:

calling a second trampoline function from execution of the target code;

calling a nested translator function from the second trampoline function;

returning from the nested translator function to the second trampoline function; and

returning from the second trampoline function to execution of the target code.

The method of claim 21,

wherein the nested translator function comprises the translator code generator function, for generating a block of target code from a corresponding block of the subject program.

The method of claim 22, further comprising:

performing, in the second trampoline function, a context switch from the first calling convention to the second calling convention, and vice versa, when switching between the target code execution and the nested translator execution.
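The nested call path through a second trampoline can be sketched as follows. All names (`second_trampoline`, `nested_code_generator`, `target_block`, `events`) are hypothetical, and the context switches are modeled simply as entries in an event log rather than real register save and restore operations.

```python
from typing import Callable, Dict, List

events: List[str] = []            # records the order of control transfers
code_cache: Dict[int, str] = {}   # hypothetical store of generated blocks


def nested_code_generator(block_id: int) -> str:
    """Nested translator function: generates target code for one block."""
    events.append(f"generate {block_id}")
    code_cache[block_id] = f"target code for block {block_id}"
    return code_cache[block_id]


def second_trampoline(nested: Callable[[int], str], block_id: int) -> str:
    """Switch to the translator context, call the nested function, switch back."""
    events.append("switch to translator context")
    try:
        return nested(block_id)
    finally:
        events.append("switch to target context")


def target_block(needed_block: int) -> str:
    """Target code that needs translator help in the middle of execution."""
    events.append("target code running")
    code = second_trampoline(nested_code_generator, needed_block)
    events.append("target code resumed")
    return code
```

The `try`/`finally` guarantees in this model that the target context is restored on every path out of the nested function.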

The method of claim 14, further comprising:

performing a profiling check for each target code block by determining, within the translator trampoline function, whether to remain in a target code execution context or to restore a translator context and return to the translator run loop.

The method of claim 24, further comprising:

assigning a profile threshold to a target code block; and

counting each repeated execution of the target code block until the threshold is reached, and thereupon returning control to the translator run loop.
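A profiling check performed inside the trampoline can be sketched as below. The `BlockObject` fields `threshold` and `count` and the return value of `trampoline` are assumptions made for illustration; in particular, whether the count is taken before or after the block runs is a design choice not fixed by the claim.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class BlockObject:
    """Hypothetical block object carrying a profile threshold and a counter."""
    name: str
    threshold: int
    count: int = 0
    code: Optional[Callable[[], Optional["BlockObject"]]] = None


def trampoline(first: BlockObject) -> Optional[BlockObject]:
    """Run blocks until one reaches its threshold; return that block, else None."""
    current: Optional[BlockObject] = first
    while current is not None:
        executed = current
        current = executed.code()      # execute; the tail names the successor
        executed.count += 1            # count this execution
        if executed.count >= executed.threshold:
            return executed            # leave target context, back to the run loop
    return None


hot = BlockObject("hot", threshold=3)
hot.code = lambda: hot                 # a block that loops back to itself
```

A run loop receiving `hot` from `trampoline` could then decide to retranslate the block with more aggressive optimization.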

The method of claim 16, further comprising:

performing a profiling check for a target code block by determining, within the target code block, whether to remain in the target code execution context or to restore the translator context and return to the translator trampoline function.

The method of claim 26, comprising:

assigning a profile threshold to a target code block; and

counting down from the threshold for each repeated execution of the target code block until zero is reached, and thereupon returning control to the translator trampoline function.
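When blocks link to one another directly, the profiling check has to live in the block itself. The following sketch assumes hypothetical names (`blocks`, `make_block`) and models the direct jump as a Python call; each block counts down from its threshold and only returns to the trampoline when the counter reaches zero.

```python
from typing import Callable, Dict, List

Block = Callable[[List[str]], str]
blocks: Dict[str, Block] = {}   # hypothetical table of stored target blocks


def make_block(name: str, successor: str, threshold: int) -> Block:
    remaining = {"count": threshold}   # counter initialised to the profile threshold

    def block(trace: List[str]) -> str:
        trace.append(name)
        remaining["count"] -= 1        # count down on each execution
        if remaining["count"] == 0:
            return name                # restore translator context: back to trampoline
        return blocks[successor](trace)   # stay in target code: jump to the successor

    return block


blocks["loop"] = make_block("loop", successor="loop", threshold=3)
```

Counting down to zero lets the check be a single decrement and compare against zero, which is typically cheap on real processors.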

A computer-readable storage medium comprising software, resident thereon in the form of computer-readable code executable by a computer, for performing each step of the method of any one of claims 1 to 27.

An apparatus comprising:

a target processor; and

translator code for executing the method of any one of claims 1 to 27.