The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
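As internet and data center traffic has grown explosively, the performance of network routers and switches has improved significantly over the past decades. Router performance depends heavily on the memory system (e.g., DRAM-based packet buffers), which often limits router scalability. However, the widening gap between memory I/O bus speed and memory cell array speed, together with the loss of row buffer locality caused by increasing channel and bank counts, severely reduces the performance gain obtainable from state-of-the-art memory technologies such as DDR4 or HBM2 DRAM. Prior work improved memory bandwidth by maintaining SRAM-based per-queue or per-bank input/output buffers in the memory controller to support a DRAM-based packet buffer. These buffers temporarily store packets when bank conflicts occur, but they cannot prevent interference-inducing traffic from thrashing the DRAM row buffers. In this study, we integrate SRAM directly into the DRAM-based packet buffer and map the packets that degrade DRAM row buffer locality into SRAM, which maximizes the locality and parallelism of DRAM accesses. The proposed scheme can benefit any existing scheme. Experimental results show a 22.41% improvement in memory bandwidth utilization over the best existing scheme for a single channel under harshly congested scenarios.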
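The idea of steering locality-degrading packets into SRAM can be illustrated with a toy model. The sketch below is not the authors' mapping policy, which the abstract does not specify; it assumes a hypothetical (bank, row) address split, one open row per bank, and a fixed number of SRAM slots that are never drained. An access that would evict an already open row is absorbed by SRAM while slots remain, so the open row survives for later accesses.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical address split: each access targets (bank, row).
Access = Tuple[int, int]


def row_hits(accesses: Iterable[Access], num_banks: int,
             sram_slots: int = 0) -> Tuple[int, int]:
    """Return (dram_row_hits, accesses_diverted_to_sram) for a toy model.

    Assumptions (not from the paper): one open row per bank, SRAM slots
    are consumed once and never freed, and timing is ignored.
    """
    open_row: List[Optional[int]] = [None] * num_banks
    hits = 0
    diverted = 0
    for bank, row in accesses:
        if open_row[bank] == row:
            # Row buffer hit: the requested row is already open.
            hits += 1
        elif open_row[bank] is not None and diverted < sram_slots:
            # Conflicting access: serve it from SRAM, keep the open row.
            diverted += 1
        else:
            # Cold or conflict miss with no SRAM left: activate the new row.
            open_row[bank] = row
    return hits, diverted


# Two interleaved flows competing for rows 1 and 2 of a single bank.
trace = [(0, 1), (0, 2), (0, 1), (0, 2), (0, 1), (0, 1)]
baseline = row_hits(trace, num_banks=1)               # DRAM only
hybrid = row_hits(trace, num_banks=1, sram_slots=2)   # with two SRAM slots
```

In this trace the DRAM-only run thrashes the row buffer on every alternation, while the run with two SRAM slots keeps row 1 open. A real design would also have to drain SRAM, bound its capacity, and preserve per-queue packet order, none of which this sketch models.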
Yongwoon SONG
Sogang University
Dongkeon CHOI
Sogang University
Hyukjun LEE
Sogang University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Yongwoon SONG, Dongkeon CHOI, Hyukjun LEE, "Designing a High Performance SRAM-DRAM Hybrid Memory Architecture for Packet Buffers" in IEICE TRANSACTIONS on Electronics,
vol. E102-C, no. 12, pp. 849-852, December 2019, doi: 10.1587/transele.2019ECS6003.
Abstract: The performance of a network router/switch has improved significantly over past decades with explosively increasing internet and data center traffic. The performance of a router heavily depends on the memory system, e.g. DRAM based packet buffers, which often limits the scalability of a router. However, a widening gap between memory I/O bus and memory cell array speed and decreasing row buffer locality from increasing channels and banks severely reduce the performance gain from state-of-the-art memory technology such as DDR4 or HBM2 DRAM. Prior works improved memory bandwidth by maintaining SRAM-based per-queue or per-bank input/output buffers in the memory controller to support a DRAM-based packet buffer. The buffers temporarily store packets when bank conflicts occur but are unable to prevent interference-inducing traffic from thrashing DRAM's row buffers. In this study, we directly integrate SRAM into the DRAM-based packet buffer and map those packets degrading row buffer locality of DRAM into SRAM. This maximizes locality and parallelism of DRAM accesses. The proposed scheme can benefit any existing schemes. Experimental results show 22.41% improvement over the best existing scheme for a single channel in terms of the memory bandwidth utilization under harsh congested scenarios.
URL: https://global.ieice.org/en_transactions/electronics/10.1587/transele.2019ECS6003/_p
@ARTICLE{e102-c_12_849,
author={Yongwoon SONG and Dongkeon CHOI and Hyukjun LEE},
journal={IEICE TRANSACTIONS on Electronics},
title={Designing a High Performance SRAM-DRAM Hybrid Memory Architecture for Packet Buffers},
year={2019},
volume={E102-C},
number={12},
pages={849-852},
abstract={The performance of a network router/switch has improved significantly over past decades with explosively increasing internet and data center traffic. The performance of a router heavily depends on the memory system, e.g. DRAM based packet buffers, which often limits the scalability of a router. However, a widening gap between memory I/O bus and memory cell array speed and decreasing row buffer locality from increasing channels and banks severely reduce the performance gain from state-of-the-art memory technology such as DDR4 or HBM2 DRAM. Prior works improved memory bandwidth by maintaining SRAM-based per-queue or per-bank input/output buffers in the memory controller to support a DRAM-based packet buffer. The buffers temporarily store packets when bank conflicts occur but are unable to prevent interference-inducing traffic from thrashing DRAM's row buffers. In this study, we directly integrate SRAM into the DRAM-based packet buffer and map those packets degrading row buffer locality of DRAM into SRAM. This maximizes locality and parallelism of DRAM accesses. The proposed scheme can benefit any existing schemes. Experimental results show 22.41% improvement over the best existing scheme for a single channel in terms of the memory bandwidth utilization under harsh congested scenarios.},
keywords={},
doi={10.1587/transele.2019ECS6003},
ISSN={1745-1353},
month={December},}
TY - JOUR
TI - Designing a High Performance SRAM-DRAM Hybrid Memory Architecture for Packet Buffers
T2 - IEICE TRANSACTIONS on Electronics
SP - 849
EP - 852
AU - Yongwoon SONG
AU - Dongkeon CHOI
AU - Hyukjun LEE
PY - 2019
DO - 10.1587/transele.2019ECS6003
JO - IEICE TRANSACTIONS on Electronics
SN - 1745-1353
VL - E102-C
IS - 12
JA - IEICE TRANSACTIONS on Electronics
Y1 - December 2019
AB - The performance of a network router/switch has improved significantly over past decades with explosively increasing internet and data center traffic. The performance of a router heavily depends on the memory system, e.g. DRAM based packet buffers, which often limits the scalability of a router. However, a widening gap between memory I/O bus and memory cell array speed and decreasing row buffer locality from increasing channels and banks severely reduce the performance gain from state-of-the-art memory technology such as DDR4 or HBM2 DRAM. Prior works improved memory bandwidth by maintaining SRAM-based per-queue or per-bank input/output buffers in the memory controller to support a DRAM-based packet buffer. The buffers temporarily store packets when bank conflicts occur but are unable to prevent interference-inducing traffic from thrashing DRAM's row buffers. In this study, we directly integrate SRAM into the DRAM-based packet buffer and map those packets degrading row buffer locality of DRAM into SRAM. This maximizes locality and parallelism of DRAM accesses. The proposed scheme can benefit any existing schemes. Experimental results show 22.41% improvement over the best existing scheme for a single channel in terms of the memory bandwidth utilization under harsh congested scenarios.
ER -