The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
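Wire delays, rather than gate delays, are becoming dominant in modern VLSI design. In current synchronous processors, the critical path lies in the cache access, not in the ALU function. Because improvements in cache performance are limited by the memory access delay, which consists mainly of wire delays, a reduction in gate delays may no longer translate into better processor performance. To solve this problem, this paper presents a novel architecture called the Cascade ALU. The Cascade ALU allows super-scalar processors built with future technologies to move the critical path into the ALU part, so the Cascade ALU can benefit from the expected progress in device speed. Since the delay of the Cascade ALU varies depending on the executed instructions, an asynchronous system is shown to be suitable for implementing it. However, because an asynchronous system may incur a large handshake overhead, this paper also presents an asynchronous Fine Grain Pipeline technique that hides the handshake overhead. Finally, this paper presents the results of a performance and area evaluation of an asynchronous implementation of the Cascade ALU. The results show that the Cascade ALU architecture scales well in performance as ALU latency is reduced and imposes little area penalty compared with current synchronous processors.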
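The abstract's reasoning about instruction-dependent delay can be illustrated with a toy timing model. This is only a sketch under stated assumptions, not the paper's evaluation method: the latency values, the per-operation handshake cost, and the function names `synchronous_time` and `asynchronous_time` are all hypothetical. It compares a clocked design that charges every operation the worst-case latency with a self-timed design that pays each operation's own latency plus a handshake cost.

```python
# Toy model only: every number here is a made-up unit, not a figure from the paper.

def synchronous_time(latencies):
    """Fixed clock: each operation is charged the worst-case latency."""
    return max(latencies) * len(latencies)

def asynchronous_time(latencies, handshake_overhead=0):
    """Self-timed: each operation costs its own latency plus a handshake cost."""
    return sum(lat + handshake_overhead for lat in latencies)

# Hypothetical per-instruction ALU latencies (arbitrary units).
latencies = [1, 1, 2, 1, 4, 1, 2, 1]

sync_total = synchronous_time(latencies)                              # worst case for all ops
async_exposed = asynchronous_time(latencies, handshake_overhead=2)    # overhead fully visible
async_hidden = asynchronous_time(latencies, handshake_overhead=0)     # overhead fully hidden

print(sync_total, async_exposed, async_hidden)
```

In this toy setting an exposed handshake cost eats most of the advantage of variable latency, which is the kind of overhead the abstract says the Fine Grain Pipeline technique is meant to hide. How much of it is hidden in practice is not something this sketch can say.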
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Motokazu OZAWA, Masashi IMAI, Yoichiro UENO, Hiroshi NAKAMURA, Takashi NANYA, "A Cascade ALU Architecture for Asynchronous Super-Scalar Processors" in IEICE TRANSACTIONS on Electronics,
vol. E84-C, no. 2, pp. 229-237, February 2001.
Abstract: Wire delays, instead of gate delays, are moving into dominance in modern VLSI design. Current synchronous processors have the critical path not in the ALU function but in the cache access. Since the cache performance enhancement is limited by the memory access delay which mainly consists of wire delays, a reduction in gate delays may no longer imply any enhancement in processor performance. To solve this problem, this paper presents a novel architecture, called the Cascade ALU. The Cascade ALU allows super-scalar processors with future technologies to move the critical path into the ALU part. Therefore the Cascade ALU can enjoy the expected progress in future device speed. Since the delay of the Cascade ALU varies depending on the executed instructions, an asynchronous system is shown to be suitable for implementing the Cascade ALU. However an asynchronous system may have a large handshake overhead, this paper also presents an asynchronous Fine Grain Pipeline technique that hides the handshake overhead. Finally, this paper presents results of performance and area evaluation for an asynchronous implementation of the cascade ALU. The results show that the cascade ALU architecture has a good performance scalability on the reduction of the ALU latency and imposes little area penalty compared with current synchronous processors.
URL: https://global.ieice.org/en_transactions/electronics/10.1587/e84-c_2_229/_p
@ARTICLE{e84-c_2_229,
author={Motokazu OZAWA and Masashi IMAI and Yoichiro UENO and Hiroshi NAKAMURA and Takashi NANYA},
journal={IEICE TRANSACTIONS on Electronics},
title={A Cascade ALU Architecture for Asynchronous Super-Scalar Processors},
year={2001},
volume={E84-C},
number={2},
pages={229-237},
abstract={Wire delays, instead of gate delays, are moving into dominance in modern VLSI design. Current synchronous processors have the critical path not in the ALU function but in the cache access. Since the cache performance enhancement is limited by the memory access delay which mainly consists of wire delays, a reduction in gate delays may no longer imply any enhancement in processor performance. To solve this problem, this paper presents a novel architecture, called the Cascade ALU. The Cascade ALU allows super-scalar processors with future technologies to move the critical path into the ALU part. Therefore the Cascade ALU can enjoy the expected progress in future device speed. Since the delay of the Cascade ALU varies depending on the executed instructions, an asynchronous system is shown to be suitable for implementing the Cascade ALU. However an asynchronous system may have a large handshake overhead, this paper also presents an asynchronous Fine Grain Pipeline technique that hides the handshake overhead. Finally, this paper presents results of performance and area evaluation for an asynchronous implementation of the cascade ALU. The results show that the cascade ALU architecture has a good performance scalability on the reduction of the ALU latency and imposes little area penalty compared with current synchronous processors.},
keywords={},
doi={},
ISSN={},
month={February},}
TY - JOUR
TI - A Cascade ALU Architecture for Asynchronous Super-Scalar Processors
T2 - IEICE TRANSACTIONS on Electronics
SP - 229
EP - 237
AU - Motokazu OZAWA
AU - Masashi IMAI
AU - Yoichiro UENO
AU - Hiroshi NAKAMURA
AU - Takashi NANYA
PY - 2001
DO -
JO - IEICE TRANSACTIONS on Electronics
SN -
VL - E84-C
IS - 2
JA - IEICE TRANSACTIONS on Electronics
Y1 - February 2001
AB - Wire delays, instead of gate delays, are moving into dominance in modern VLSI design. Current synchronous processors have the critical path not in the ALU function but in the cache access. Since the cache performance enhancement is limited by the memory access delay which mainly consists of wire delays, a reduction in gate delays may no longer imply any enhancement in processor performance. To solve this problem, this paper presents a novel architecture, called the Cascade ALU. The Cascade ALU allows super-scalar processors with future technologies to move the critical path into the ALU part. Therefore the Cascade ALU can enjoy the expected progress in future device speed. Since the delay of the Cascade ALU varies depending on the executed instructions, an asynchronous system is shown to be suitable for implementing the Cascade ALU. However an asynchronous system may have a large handshake overhead, this paper also presents an asynchronous Fine Grain Pipeline technique that hides the handshake overhead. Finally, this paper presents results of performance and area evaluation for an asynchronous implementation of the cascade ALU. The results show that the cascade ALU architecture has a good performance scalability on the reduction of the ALU latency and imposes little area penalty compared with current synchronous processors.
ER -