




























































































"Hennessy and Patterson have done it again! The 4th edition is a classic encore that has been adapted beautifully to meet the rapidly changing constraints of 'late-CMOS-era' technology. The detailed case studies of real processor products are especially educational, and the text reads so smoothly that it is difficult to put down. This book is a must-read for students and professionals alike!" —Pradip Bose, IBM
"This latest edition of Computer Architecture is sure to provide students with the architectural framework and foundation they need to become influential architects of the future."
—Ravishankar Iyer, Intel Corp.
"As technology has advanced, and design opportunities and constraints have changed, so has this book. The 4th edition continues the tradition of presenting the latest in innovations with commercial impact, alongside the foundational concepts: advanced processor and memory system design techniques, multithreading and chip multiprocessors, storage systems, virtual machines, and other concepts. This book is an excellent resource for anybody interested in learning the architectural concepts underlying real commercial products."
—Gurindar Sohi, University of Wisconsin-Madison
"I am very happy to have my students study computer architecture using this fantastic book and am a little jealous for not having written it myself."
—Mateo Valero, UPC, Barcelona
"Hennessy and Patterson continue to evolve their teaching methods with the changing landscape of computer system design. Students gain unique insight into the factors influencing the shape of computer architecture design and the potential research directions in the computer systems field."
—Dan Connors, University of Colorado at Boulder
"With this revision, Computer Architecture will remain a must-read for all computer architecture students in the coming decade."
—Wen-mei Hwu, University of Illinois at Urbana-Champaign
"The 4th edition of Computer Architecture continues in the tradition of providing a relevant and cutting edge approach that appeals to students, researchers, and designers of computer systems. The lessons that this new edition teaches will continue to be as relevant as ever for its readers."
—David Brooks, Harvard University
"With the 4th edition, Hennessy and Patterson have shaped Computer Architecture back to the lean focus that made the 1st edition an instant classic."
—Mark D. Hill, University of Wisconsin-Madison
John L. Hennessy
Stanford University
David A. Patterson
University of California at Berkeley
With Contributions by
Andrea C. Arpaci-Dusseau, University of Wisconsin-Madison
Remzi H. Arpaci-Dusseau, University of Wisconsin-Madison
Krste Asanovic, Massachusetts Institute of Technology
Robert P. Colwell, R&E Colwell & Associates, Inc.
Thomas M. Conte, North Carolina State University
Jose Duato, Universitat Politecnica de Valencia and Simula
Diana Franklin, California Polytechnic State University, San Luis Obispo
David Goldberg, Xerox Palo Alto Research Center
Wen-mei W. Hwu, University of Illinois at Urbana-Champaign
Norman P. Jouppi, HP Labs
Timothy M. Pinkston, University of Southern California
John W. Sias, University of Illinois at Urbana-Champaign
David A. Wood, University of Wisconsin-Madison
Publisher: Denise E. M. Penrose
Project Manager: Dusty Friedman, The Book Company
In-house Senior Project Manager: Brandy Lilly
Developmental Editor: Nate McFadden
Editorial Assistant: Kimberlee Honjo
Cover Design: Elisabeth Beller and Ross Carron Design
Cover Image: Richard I'Anson's Collection: Lonely Planet Images
Composition: Nancy Logan
Text Design: Rebecca Evans & Associates
Technical Illustration: David Ruppe, Impact Publications
Copy editor: Ken DellaPenta
Proofreader: Jamie Thaman
Indexer: Nancy Ball
Printer: Maple-Vail Book Manufacturing Group
Morgan Kaufmann Publishers is an imprint of Elsevier
500 Sansome Street, Suite 400, San Francisco, CA 94111
This book is printed on acid-free paper.
© 1990, 1996, 2003, 2007 by Elsevier, Inc. All rights reserved.
Published 1990. Fourth edition 2007
Designations used by companies to distinguish their products are often claimed as trademarks or registered trademarks. In all instances in which Morgan Kaufmann Publishers is aware of a claim, the product names appear in initial capital or all capital letters. Readers, however, should contact the appropriate companies for more complete information regarding trademarks and registration.
Permissions may be sought directly from Elsevier's Science & Technology Rights Department in Oxford, UK: phone: (+44) 1865 843830, fax: (+44) 1865 853333, e-mail: permissions@elsevier.com. You may also complete your request on-line via the Elsevier Science homepage (http://elsevier.com), by selecting "Customer Support" and then "Obtaining Permissions."
Library of Congress Cataloging-in-Publication Data
Hennessy, John L. Computer architecture : a quantitative approach / John L. Hennessy, David A. Patterson ; with contributions by Andrea C. Arpaci-Dusseau... [et al.]. —4th ed. p.cm. Includes bibliographical references and index. ISBN 13: 978-0-12-370490-0 (pbk. : alk. paper) ISBN 10: 0-12-370490-1 (pbk. : alk. paper) 1. Computer architecture. I. Patterson, David A. II. Arpaci-Dusseau, Andrea C. III. Title.
QA76.9A73P377 2006 004.2'2—dc
2006024358
For all information on all Morgan Kaufmann publications, visit our website at www.mkp.com or www.books.elsevier.com
Printed in the United States of America 06 07 08 09 10 5 4 3 2 1
I am honored and privileged to write the foreword for the fourth edition of this most important book in computer architecture. In the first edition, Gordon Bell, my first industry mentor, predicted the book's central position as the definitive text for computer architecture and design. He was right. I clearly remember the excitement generated by the introduction of this work. Rereading it now, with significant extensions added in the three new editions, has been a pleasure all over again. No other work in computer architecture (frankly, no other work I have read in any field) so quickly and effortlessly takes the reader from ignorance to a breadth and depth of knowledge.
This book is dense in facts and figures, in rules of thumb and theories, in examples and descriptions. It is stuffed with acronyms, technologies, trends, formulas, illustrations, and tables. And this is thoroughly appropriate for a work on architecture. The architect's role is not that of a scientist or inventor who will deeply study a particular phenomenon and create new basic materials or techniques. Nor is the architect the craftsman who masters the handling of tools to craft the finest details. The architect's role is to combine a thorough understanding of the state of the art of what is possible, a thorough understanding of the historical and current styles of what is desirable, a sense of design to conceive a harmonious total system, and the confidence and energy to marshal this knowledge and available resources to go out and get something built. To accomplish this, the architect needs a tremendous density of information with an in-depth understanding of the fundamentals and a quantitative approach to ground his thinking. That is exactly what this book delivers.
As computer architecture has evolved (from a world of mainframes, minicomputers, and microprocessors, to a world dominated by microprocessors, and now into a world where microprocessors themselves are encompassing all the complexity of mainframe computers), Hennessy and Patterson have updated their book appropriately. The first edition showcased the IBM 360, DEC VAX, and Intel 80x86, each the pinnacle of its class of computer, and helped introduce the world to RISC architecture. The later editions focused on the details of the 80x86 and RISC processors, which had come to dominate the landscape. This latest edition expands the coverage of threading and multiprocessing, virtualization and memory hierarchy, and storage systems, giving the reader context appropriate to today's most important directions and setting the stage for the next decade of design. It highlights the AMD Opteron and SUN Niagara as the best examples of the x86 and SPARC (RISC) architectures brought into the new world of multiprocessing and system-on-a-chip architecture, thus grounding the art and science in real-world commercial examples.
The first chapter, in less than 60 pages, introduces the reader to the taxonomies of computer design and the basic concerns of computer architecture, gives an overview of the technology trends that drive the industry, and lays out a quantitative approach to using all this information in the art of computer design. The next two chapters focus on traditional CPU design and give a strong grounding in the possibilities and limits in this core area. The final three chapters build out an understanding of system issues with multiprocessing, memory hierarchy, and storage. Knowledge of these areas has always been of critical importance to the computer architect. In this era of system-on-a-chip designs, it is essential for every CPU architect. Finally, the appendices provide a great depth of understanding by working through specific examples in great detail.
In design it is important to look at both the forest and the trees and to move easily between these views. As you work through this book you will find plenty of both.
The result of great architecture, whether in computer design, building design or textbook design, is to take the customer's requirements and desires and return a design that causes that customer to say, "Wow, I didn't know that was possible." This book succeeds on that measure and will, I hope, give you as much pleasure and value as it has me.
Contents
2.8 Exploiting ILP Using Dynamic Scheduling, Multiple Issue, and Speculation
2.9 Advanced Techniques for Instruction Delivery and Speculation
2.10 Putting It All Together: The Intel Pentium
2.11 Fallacies and Pitfalls
2.12 Concluding Remarks
2.13 Historical Perspective and References
Case Studies with Exercises by Robert P. Colwell
Chapter 3 Limits on Instruction-Level Parallelism
3.1 Introduction
3.2 Studies of the Limitations of ILP
5.4 Protection: Virtual Memory and Virtual Machines
5.5 Crosscutting Issues: The Design of Memory Hierarchies
5.6 Putting It All Together: AMD Opteron Memory Hierarchy
5.7 Fallacies and Pitfalls
5.8 Concluding Remarks
5.9 Historical Perspective and References
Case Studies with Exercises by Norman P. Jouppi
Chapter 6 Storage Systems
6.1 Introduction
6.2 Advanced Topics in Disk Storage
6.3 Definition and Examples of Real Faults and Failures
6.4 I/O Performance, Reliability Measures, and Benchmarks
6.5 A Little Queuing Theory
6.6 Crosscutting Issues
6.7 Designing and Evaluating an I/O System—The Internet Archive Cluster
6.8 Putting It All Together: NetApp FAS6000 Filer
6.9 Fallacies and Pitfalls
6.10 Concluding Remarks
6.11 Historical Perspective and References
Case Studies with Exercises by Andrea C. Arpaci-Dusseau and Remzi H. Arpaci-Dusseau
Appendix A Pipelining: Basic and Intermediate Concepts
A.1 Introduction
A.2 The Major Hurdle of Pipelining—Pipeline Hazards
A.3 How Is Pipelining Implemented?
A.4 What Makes Pipelining Hard to Implement?
A.5 Extending the MIPS Pipeline to Handle Multicycle Operations
A.6 Putting It All Together: The MIPS R4000 Pipeline
A.7 Crosscutting Issues
A.8 Fallacies and Pitfalls
A.9 Concluding Remarks
A.10 Historical Perspective and References
Appendix B Instruction Set Principles and Examples
B.1 Introduction
B.2 Classifying Instruction Set Architectures
B.3 Memory Addressing
B.4 Type and Size of Operands
B.5 Operations in the Instruction Set
Through four editions of this book, our goal has been to describe the basic principles underlying what will be tomorrow's technological developments. Our excitement about the opportunities in computer architecture has not abated, and we echo what we said about the field in the first edition: "It is not a dreary science of paper machines that will never work. No! It's a discipline of keen intellectual interest, requiring the balance of marketplace forces to cost-performance-power, leading to glorious failures and some notable successes."
Our primary objective in writing our first book was to change the way people learn and think about computer architecture. We feel this goal is still valid and important. The field is changing daily and must be studied with real examples and measurements on real computers, rather than simply as a collection of definitions and designs that will never need to be realized. We offer an enthusiastic welcome to anyone who came along with us in the past, as well as to those who are joining us now. Either way, we can promise the same quantitative approach to, and analysis of, real systems.
As with earlier versions, we have strived to produce a new edition that will continue to be as relevant for professional engineers and architects as it is for those involved in advanced computer architecture and design courses. As much as its predecessors, this edition aims to demystify computer architecture through an emphasis on cost-performance-power trade-offs and good engineering design. We believe that the field has continued to mature and move toward the rigorous quantitative foundation of long-established scientific and engineering disciplines.
The fourth edition of Computer Architecture: A Quantitative Approach may be the most significant since the first edition. Shortly before we started this revision, Intel announced that it was joining IBM and Sun in relying on multiple processors or cores per chip for high-performance designs. As the first figure in the book documents, after 16 years of doubling performance every 18 months, single-processor performance improvement has dropped to modest annual improvements. This fork in the computer architecture road means that for the first time in history, no one is building a much faster sequential processor. If you want your program to run significantly faster, say, to justify the addition of new features, you're going to have to parallelize your program.
Hence, after three editions focused primarily on higher performance by exploiting instruction-level parallelism (ILP), an equal focus of this edition is thread-level parallelism (TLP) and data-level parallelism (DLP). While earlier editions had material on TLP and DLP in big multiprocessor servers, now TLP and DLP are relevant for single-chip multicores. This historic shift led us to change the order of the chapters: the chapter on multiple processors was the sixth chapter in the last edition, but is now the fourth chapter of this edition.
The changing technology has also motivated us to move some of the content from later chapters into the first chapter. Because technologists predict much higher hard and soft error rates as the industry moves to semiconductor processes with feature sizes 65 nm or smaller, we decided to move the basics of dependability from Chapter 7 in the third edition into Chapter 1. As power has become the dominant factor in determining how much you can place on a chip, we also beefed up the coverage of power in Chapter 1. Of course, the content and examples in all chapters were updated, as we discuss below.
In addition to technological sea changes that have shifted the contents of this edition, we have taken a new approach to the exercises in this edition. It is surprisingly difficult and time-consuming to create interesting, accurate, and unambiguous exercises that evenly test the material throughout a chapter. Alas, the Web has reduced the half-life of exercises to a few months.
Rather than working out an assignment, a student can search the Web to find answers not long after a book is published. Hence, a tremendous amount of hard work quickly becomes unusable, and instructors are denied the opportunity to test what students have learned.
To help mitigate this problem, in this edition we are trying two new ideas. First, we recruited experts from academia and industry on each topic to write the exercises. This means some of the best people in each field are helping us to create interesting ways to explore the key concepts in each chapter and test the reader's understanding of that material. Second, each group of exercises is organized around a set of case studies. Our hope is that the quantitative example in each case study will remain interesting over the years, robust and detailed enough to allow instructors the opportunity to easily create their own new exercises, should they choose to do so. Key, however, is that each year we will continue to release new exercise sets for each of the case studies. These new exercises will have critical changes in some parameters so that answers to old exercises will no longer apply.
Another significant change is that we followed the lead of the third edition of Computer Organization and Design (COD) by slimming the text to include the material that almost all readers will want to see and moving the appendices that
As before, we have taken a conservative approach to topic selection, for there are many more interesting ideas in the field than can reasonably be covered in a treatment of basic principles. We have steered away from a comprehensive survey of every architecture a reader might encounter. Instead, our presentation focuses on core concepts likely to be found in any new machine. The key criterion remains that of selecting ideas that have been examined and utilized successfully enough to permit their discussion in quantitative terms.
Our intent has always been to focus on material that is not available in equivalent form from other sources, so we continue to emphasize advanced content wherever possible. Indeed, there are several systems here whose descriptions cannot be found in the literature. (Readers interested strictly in a more basic introduction to computer architecture should read Computer Organization and Design: The Hardware/Software Interface, third edition.)
Chapter 1 has been beefed up in this edition. It includes formulas for static power, dynamic power, integrated circuit costs, reliability, and availability. We go into more depth than prior editions on the use of the geometric mean and the geometric standard deviation to capture the variability of the mean. Our hope is that these topics can be used through the rest of the book. In addition to the classic quantitative principles of computer design and performance measurement, the benchmark section has been upgraded to use the new SPEC2006 suite.
Our view is that the instruction set architecture is playing less of a role today than in 1990, so we moved this material to Appendix B. It still uses the MIPS architecture. For fans of ISAs, Appendix J covers 10 RISC architectures, the 80x86, the DEC VAX, and the IBM 360/370.
Chapters 2 and 3 cover the exploitation of instruction-level parallelism in high-performance processors, including superscalar execution, branch prediction, speculation, dynamic scheduling, and the relevant compiler technology. As mentioned earlier, Appendix A is a review of pipelining in case you need it. Chapter 3 surveys the limits of ILP. New to this edition is a quantitative evaluation of multithreading. Chapter 3 also includes a head-to-head comparison of the AMD Athlon, Intel Pentium 4, Intel Itanium 2, and IBM Power5, each of which has made separate bets on exploiting ILP and TLP. While the last edition contained a great deal on Itanium, we moved much of this material to Appendix G, indicating our view that this architecture has not lived up to the early claims.
Given the switch in the field from exploiting only ILP to an equal focus on thread- and data-level parallelism, we moved multiprocessor systems up to Chapter 4, which focuses on shared-memory architectures. The chapter begins with the performance of such an architecture.
It then explores symmetric and distributed-memory architectures, examining both organizational principles and performance. Topics in synchronization and memory consistency models are next. The example is the Sun T1 ("Niagara"), a radical design for a commercial product. It reverted to a single-instruction issue, 6-stage pipeline microarchitecture. It put 8 of these on a single chip, and each supports 4 threads. Hence, software sees 32 threads on this single, low-power chip.
As mentioned earlier, Appendix C contains an introductory review of cache principles, which is available in case you need it. This shift allows Chapter 5 to start with 11 advanced optimizations of caches. The chapter includes a new section on virtual machines, which offers advantages in protection, software management, and hardware management. The example is the AMD Opteron, giving both its cache hierarchy and the virtual memory scheme for its recently expanded 64-bit addresses.
Chapter 6, "Storage Systems," has an expanded discussion of reliability and availability, a tutorial on RAID with a description of RAID 6 schemes, and rarely found failure statistics of real systems. It continues to provide an introduction to queuing theory and I/O performance benchmarks. Rather than go through a series of steps to build a hypothetical cluster as in the last edition, we evaluate the cost, performance, and reliability of a real cluster: the Internet Archive. The "Putting It All Together" example is the NetApp FAS6000 filer, which is based on the AMD Opteron microprocessor.
This brings us to Appendices A through L. As mentioned earlier, Appendices A and C are tutorials on basic pipelining and caching concepts. Readers relatively new to pipelining should read Appendix A before Chapters 2 and 3, and those new to caching should read Appendix C before Chapter 5.
Appendix B covers principles of ISAs, including MIPS64, and Appendix J describes 64-bit versions of Alpha, MIPS, PowerPC, and SPARC and their multimedia extensions.
It also includes some classic architectures (80x86, VAX, and IBM 360/370) and popular embedded instruction sets (ARM, Thumb, SuperH, MIPS16, and Mitsubishi M32R). Appendix G is related, in that it covers architectures and compilers for VLIW ISAs.
Appendix D, updated by Thomas M. Conte, consolidates the embedded material in one place.
Appendix E, on networks, has been extensively revised by Timothy M. Pinkston and Jose Duato. Appendix F, updated by Krste Asanovic, includes a description of vector processors. We think these two appendices are some of the best material we know of on each topic.
Appendix H describes parallel processing applications and coherence protocols for larger-scale, shared-memory multiprocessing.
Appendix I, by David Goldberg, describes computer arithmetic.
Appendix K collects the "Historical Perspective and References" from each chapter of the third edition into a single appendix. It attempts to give proper credit for the ideas in each chapter and a sense of the history surrounding the inventions. We like to think of this as presenting the human drama of computer design. It also supplies references that the student of architecture may want to pursue. If you have time, we recommend reading some of the classic papers in the field that are mentioned in these sections. It is both enjoyable and educational