Copyrights notice
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Nobuaki TOJO, Nozomu TOGAWA, Masao YANAGISAWA, Tatsuo OHTSUKI, "An L1 Cache Design Space Exploration System for Embedded Applications" in IEICE TRANSACTIONS on Fundamentals,
vol. E92-A, no. 6, pp. 1442-1453, June 2009, doi: 10.1587/transfun.E92.A.1442.
Abstract: In an embedded system where a single application or a class of applications is repeatedly executed on a processor, its cache configuration can be customized such that an optimal one is achieved. We can have an optimal cache configuration which minimizes overall memory access time by varying the three cache parameters: the number of sets, a line size, and an associativity. In this paper, we first propose two cache simulation algorithms: CRCB1 and CRCB2, based on Cache Inclusion Property. They realize exact cache simulation but decrease the number of cache hit/miss judgments dramatically. We further propose three more cache design space exploration algorithms: CRMF1, CRMF2, and CRMF3, based on our experimental observations. They can find an almost optimal cache configuration from the viewpoint of access time. By using our approach, the number of cache hit/miss judgments required for optimizing cache configurations is reduced to 1/10-1/50 compared to conventional approaches. As a result, our proposed approach totally runs an average of 3.2 times faster and a maximum of 5.3 times faster compared to the fastest approach proposed so far. Our proposed cache simulation approach achieves the world's fastest cache design space exploration when optimizing total memory access time.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/transfun.E92.A.1442/_p
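The exploration the abstract describes — sweeping the number of sets, line size, and associativity, and picking the configuration with the lowest total memory access time — can be sketched as a naive exhaustive baseline. This is not the authors' CRCB/CRMF algorithms (which prune hit/miss judgments via the Cache Inclusion Property); the parameter ranges and timing constants below are illustrative assumptions only:

```python
# Hypothetical sketch of exhaustive L1 cache design space exploration.
# Not the paper's CRCB/CRMF code; ranges and latencies are assumed.
from itertools import product

def simulate(trace, num_sets, line_size, assoc):
    """Count hits/misses for an LRU set-associative cache over an address trace."""
    sets = [[] for _ in range(num_sets)]  # each set: list of tags, MRU last
    hits = misses = 0
    for addr in trace:
        block = addr // line_size          # which cache block the address falls in
        idx = block % num_sets             # set index
        tag = block // num_sets            # tag within that set
        s = sets[idx]
        if tag in s:
            hits += 1
            s.remove(tag)
            s.append(tag)                  # move to MRU position
        else:
            misses += 1
            if len(s) >= assoc:
                s.pop(0)                   # evict LRU entry
            s.append(tag)
    return hits, misses

def explore(trace, hit_time=1, miss_penalty=50):
    """Sweep sets x line size x associativity; return (best_time, best_config)."""
    best = None
    for num_sets, line_size, assoc in product(
            [2 ** k for k in range(4, 10)],  # 16..512 sets (assumed range)
            [16, 32, 64],                    # line size in bytes (assumed)
            [1, 2, 4]):                      # associativity (assumed)
        hits, misses = simulate(trace, num_sets, line_size, assoc)
        t = hits * hit_time + misses * (hit_time + miss_penalty)
        if best is None or t < best[0]:
            best = (t, (num_sets, line_size, assoc))
    return best
```

Every configuration here re-simulates the full trace, so the cost is (number of configurations) x (trace length) hit/miss judgments; the paper's contribution is precisely to cut that judgment count by one to two orders of magnitude.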
@ARTICLE{e92-a_6_1442,
author={Nobuaki TOJO and Nozomu TOGAWA and Masao YANAGISAWA and Tatsuo OHTSUKI},
journal={IEICE TRANSACTIONS on Fundamentals},
title={An L1 Cache Design Space Exploration System for Embedded Applications},
year={2009},
volume={E92-A},
number={6},
pages={1442-1453},
abstract={In an embedded system where a single application or a class of applications is repeatedly executed on a processor, its cache configuration can be customized such that an optimal one is achieved. We can have an optimal cache configuration which minimizes overall memory access time by varying the three cache parameters: the number of sets, a line size, and an associativity. In this paper, we first propose two cache simulation algorithms: CRCB1 and CRCB2, based on Cache Inclusion Property. They realize exact cache simulation but decrease the number of cache hit/miss judgments dramatically. We further propose three more cache design space exploration algorithms: CRMF1, CRMF2, and CRMF3, based on our experimental observations. They can find an almost optimal cache configuration from the viewpoint of access time. By using our approach, the number of cache hit/miss judgments required for optimizing cache configurations is reduced to 1/10-1/50 compared to conventional approaches. As a result, our proposed approach totally runs an average of 3.2 times faster and a maximum of 5.3 times faster compared to the fastest approach proposed so far. Our proposed cache simulation approach achieves the world's fastest cache design space exploration when optimizing total memory access time.},
keywords={},
doi={10.1587/transfun.E92.A.1442},
ISSN={1745-1337},
month={June},}
TY - JOUR
TI - An L1 Cache Design Space Exploration System for Embedded Applications
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 1442
EP - 1453
AU - Nobuaki TOJO
AU - Nozomu TOGAWA
AU - Masao YANAGISAWA
AU - Tatsuo OHTSUKI
PY - 2009
DO - 10.1587/transfun.E92.A.1442
JO - IEICE TRANSACTIONS on Fundamentals
SN - 1745-1337
VL - E92-A
IS - 6
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - 2009/06//
AB - In an embedded system where a single application or a class of applications is repeatedly executed on a processor, its cache configuration can be customized such that an optimal one is achieved. We can have an optimal cache configuration which minimizes overall memory access time by varying the three cache parameters: the number of sets, a line size, and an associativity. In this paper, we first propose two cache simulation algorithms: CRCB1 and CRCB2, based on Cache Inclusion Property. They realize exact cache simulation but decrease the number of cache hit/miss judgments dramatically. We further propose three more cache design space exploration algorithms: CRMF1, CRMF2, and CRMF3, based on our experimental observations. They can find an almost optimal cache configuration from the viewpoint of access time. By using our approach, the number of cache hit/miss judgments required for optimizing cache configurations is reduced to 1/10-1/50 compared to conventional approaches. As a result, our proposed approach totally runs an average of 3.2 times faster and a maximum of 5.3 times faster compared to the fastest approach proposed so far. Our proposed cache simulation approach achieves the world's fastest cache design space exploration when optimizing total memory access time.
ER -