E-book, English, Volume 92, 288 pages
Series: Advances in Computers
1st edition, 2014
ISBN: 978-0-12-799933-3
Publisher: Elsevier Science & Techn.
Format: EPUB
Copy protection: ePub watermark
Since its first volume in 1960, Advances in Computers has presented detailed coverage of innovations in computer hardware, software, theory, design, and applications. It has also provided contributors with a medium in which they can explore their subjects in greater depth and breadth than journal articles usually allow. As a result, many articles have become standard references that continue to be of significant, lasting value in this rapidly expanding field.
- In-depth surveys and tutorials on new computer technology
- Well-known authors and researchers in the field
- Extensive bibliographies with most chapters
- Many of the volumes are devoted to single themes or subfields of computer science
Authors/Editors
Further information & material
Survey on System I/O Hardware Transactions and Impact on Latency, Throughput, and Other Factors
Steen Larsen*,† and Ben Lee*, *School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, Oregon, USA, †Intel Corporation, Hillsboro, Oregon, USA
Abstract
Computer system input/output (I/O) has evolved with processor and memory technologies in terms of reducing latency, increasing bandwidth, and other factors. As requirements increase for I/O, such as networking, storage, and video, descriptor-based direct memory access (DMA) transactions have become more important in high-performance systems to move data between I/O adapters and system memory buffers. DMA transactions are done with hardware engines below the software protocol abstraction layers in all systems other than rudimentary embedded controllers. Central processing units (CPUs) can switch to other tasks by offloading hardware DMA transfers to the I/O adapters. Each I/O interface has one or more separately instantiated descriptor-based DMA engines optimized for a given I/O port. I/O transactions are optimized by accelerator functions to reduce latency, improve throughput, and reduce CPU overhead. This chapter surveys the current state of high-performance I/O architecture advances and explores benefits and limitations. With the proliferation of CPU multicores within a system, multi-GB/s ports, and on-die integration of system functions, changes beyond the techniques surveyed may be needed for optimal I/O architecture performance.
Keywords
Input/output; Processors; Controllers; Memory; DMA; Latency; Throughput; Power
Abbreviations
ARM Acorn RISC Machine
BIOS basic input/output system—allows access by the operating system to low-level hardware
BW bandwidth supported by an interface, usually synonymous with throughput capability
CNI coherent network interface
CPU central processing unit—consisting of potentially multiple cores, each with one or more hardware threads of execution
CRC cyclic redundancy check
CQE completion queue entry—used in RDMA to track transaction completions
DCA direct cache access
DDR double data rate—allows a slower clock to transmit twice the data per cycle. Usually based on both the rising and falling edge of a clock signal
DDR3 3rd generation DDR memory interface
DLP data link layer protocol in PCIe, which is similar to the link layer in a networking stack
DMA direct memory access—allows read or write transactions with system memory
DSP digital signal processing
FPGA field-programmable gate array
FSB front-side bus—a processor interface protocol that was replaced by Intel QPI and AMD HyperTransport
GbE gigabit Ethernet
GBps gigabytes per second
Gbps gigabits per second (1 GBps = 8 Gbps)
GHz gigahertz
GOQ global observation queue
GPU graphic processing unit
HPC high-performance computing—usually implies a high-speed interconnection of high-performance systems
HW hardware
ICH Intel I/O controller hub—interfaces to the IOH to support slower system protocols, such as USB and BIOS memory
I/O input/output
IOH Intel I/O hub—interfaces between QPI and PCIe interfaces
iWARP Internet wide area RDMA protocol—an RDMA protocol that supports lower level Ethernet protocol transactions
kB kilobyte, 1024 bytes. Sometimes reduced to “K” based on context
L1 cache level 1 cache
L2 cache level 2 cache
LCD liquid crystal display
LLC last-level cache—level 3 cache
LLI low latency interrupt
LLP link layer protocol—used in PCIe
LRO large receive offloading
LSO large segment offload
MB megabytes
MESI(F) modified, exclusive, shared, invalid, and optionally forward—protocol to maintain memory coherency between different CPUs in a system
MFC memory flow controller—used to manage SPU DMA transactions
MMIO memory-mapped I/O
MPI message passing interface—a protocol to pass messages between systems, often used in HPC
MSI message signaled interrupt—used in PCIe to interrupt a core
MTU maximum transmission unit
NIC network interface controller
NUMA nonuniform memory access—allows multiple pools of memory to be shared between CPUs with a coherency protocol
PCIe Peripheral Component Interconnect express—defined at www.pcisig.com. Multiple lanes (1–16) of serial I/O traffic reaching 16 Gbps per lane. Multiple generations of PCIe exist, represented by Gen1, Gen2, Gen3, and Gen4. PCIe protocol levels have similarities with the networking ISO stack
PHY PHYsical interface defining the cable (fiber/copper) interfacing protocol
PIO programmed I/O—often synonymous with MMIO
QDR quad data rate—allows four times the data rate based on a slower clock frequency
QoS quality of service—a metric to define guaranteed minimums of service quality
QP queue pair—transmit queue and receive queue structure in RDMA to allow interfacing between two or more systems
QPI QuickPath Interconnect—Intel's proprietary CPU interface supporting MESI(F) memory coherence protocol
RAID redundant array of independent disks
RDMA remote direct memory access—used to access memory between two or more systems
RSS receive side scaling
RTOS real-time operating system
RX reception from a network to a system
SAS serial attached SCSI
SCC single-chip cloud computer
SCSI small computer system interface
SMT simultaneous multithreading
SPE synergistic processing element in the Cell processor
SPU synergistic processing unit in the Cell SPE
SSD solid-state disk
SW software
TCP/IP transmission control protocol and Internet protocol networking stack
TLP transaction layer protocol of PCIe stack
TOE TCP/IP offload engine
TX transmission from a system to a network
USB universal serial bus
WQE work queue entry—used in RDMA to track transaction parameters
1 Introduction
Input/output (I/O) is becoming a peer to the processor core (or simply core) and memory in terms of latency, bandwidth, and power requirements. Historically, when a core was simpler and more directly I/O focused, it was acceptable to "bit-bang" I/O port operations using port I/O or memory-mapped I/O (MMIO) models [1]. However, with complex user interfaces and programs using multiple processes, the benefit of offloading data movement to an I/O adapter became more apparent. Since I/O devices are much slower than the core–memory interface, it makes sense to move data at a pace governed by the external device.
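As a rough illustration of the MMIO model mentioned above, the following C sketch shows a core "bit-banging" a buffer out through a device register one word at a time. The register addresses and the busy bit are hypothetical, invented for illustration rather than taken from any real device.

#include <stddef.h>
#include <stdint.h>

/* Hypothetical MMIO register block; on a real system the base address
   would come from BIOS/PCIe enumeration, not a hard-coded constant. */
#define DEV_BASE   ((volatile uint32_t *)0xFEDC0000u)
#define DEV_DATA   (DEV_BASE[0])   /* data register (assumed)   */
#define DEV_STATUS (DEV_BASE[1])   /* status register (assumed) */
#define DEV_BUSY   0x1u            /* assumed busy bit          */

/* "Bit-bang" a buffer out one word at a time: the core performs
   every transfer itself and stalls whenever the device is busy. */
static void mmio_write_buf(const uint32_t *buf, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        while (DEV_STATUS & DEV_BUSY)
            ;                    /* core spins instead of doing other work */
        DEV_DATA = buf[i];       /* one uncached store per word */
    }
}

The point is that the core itself issues every store and spins while the device is busy, which is exactly the overhead that offloading data movement to an adapter removes.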
Typically, I/O data transfer is initiated using a descriptor containing the physical address and size of the data to be moved. This descriptor is then posted (i.e., sent) to the I/O adapter, which then processes the direct memory access (DMA) read/write operations as fast as the core–memory bandwidth allows. The descriptor-based DMA approach makes sense when the I/O bandwidth requirements are much lower than the...
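To make the descriptor mechanism concrete, here is a minimal C sketch of the kind of transmit descriptor ring a driver might post to an adapter. The structure fields, ring size, and doorbell register are assumptions for illustration, not the layout of any particular device, and details such as virtual-to-physical address translation and completion handling are omitted.

#include <stdint.h>

/* Hypothetical DMA descriptor: physical address and size of one buffer. */
struct dma_desc {
    uint64_t buf_phys;   /* physical address of the data buffer    */
    uint32_t len;        /* number of bytes to transfer            */
    uint32_t flags;      /* e.g., end-of-packet, interrupt-on-done */
};

#define RING_SIZE 256
static struct dma_desc tx_ring[RING_SIZE];   /* shared with the adapter   */
static uint32_t tx_tail;                     /* next free descriptor slot */

/* Assumed MMIO "doorbell" register the adapter watches for new work. */
#define TX_DOORBELL ((volatile uint32_t *)0xFEDC1000u)

/* Post one buffer: fill in a descriptor, then notify the adapter with a
   single MMIO write. The adapter's DMA engine reads the descriptor and
   moves the data while the core continues with other tasks. */
static void post_tx(uint64_t buf_phys, uint32_t len)
{
    struct dma_desc *d = &tx_ring[tx_tail];
    d->buf_phys = buf_phys;
    d->len      = len;
    d->flags    = 0;
    tx_tail = (tx_tail + 1) % RING_SIZE;
    *TX_DOORBELL = tx_tail;   /* doorbell: adapter fetches up to tx_tail */
}

After the doorbell write, the adapter's DMA engine fetches the descriptor and moves the payload at its own pace, leaving the core free for other work.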