
COMPUTER ORGANIZATION AND EMBEDDED SYSTEMS


COMPUTER ORGANIZATION AND EMBEDDED SYSTEMS SIXTH EDITION

Carl Hamacher Queen’s University

Zvonko Vranesic University of Toronto

Safwat Zaky University of Toronto

Naraig Manjikian Queen’s University

COMPUTER ORGANIZATION AND EMBEDDED SYSTEMS, SIXTH EDITION

Published by McGraw-Hill, a business unit of The McGraw-Hill Companies, Inc., 1221 Avenue of the Americas, New York, NY 10020. Copyright © 2012 by The McGraw-Hill Companies, Inc. All rights reserved. Previous editions 2002, 1996, and 1990. No part of this publication may be reproduced or distributed in any form or by any means, or stored in a database or retrieval system, without the prior written consent of The McGraw-Hill Companies, Inc., including, but not limited to, in any network or other electronic storage or transmission, or broadcast for distance learning.

Some ancillaries, including electronic and print components, may not be available to customers outside the United States.

This book is printed on acid-free paper.

1 2 3 4 5 6 7 8 9 DOC/DOC 0 9 8 7 6 5 4 3 2 1

ISBN 978–0–07–338065–0
MHID 0–07–338065–2

Vice President & Editor-in-Chief: Marty Lange
Vice President EDP/Central Publishing Services: Kimberly Meriwether David
Publisher: Raghothaman Srinivasan
Senior Sponsoring Editor: Peter E. Massar
Developmental Editor: Darlene M. Schueller
Senior Marketing Manager: Curt Reynolds
Senior Project Manager: Lisa A. Bruflodt
Buyer: Laura Fuller
Design Coordinator: Brenda A. Rolwes
Media Project Manager: Balaji Sundararaman
Cover Design: Studio Montage, St. Louis, Missouri
Cover Image: © Royalty-Free/CORBIS
Compositor: Techsetters, Inc.
Typeface: 10/12 Times Roman
Printer: R. R. Donnelley & Sons Company/Crawfordsville, IN

Library of Congress Cataloging-in-Publication Data

Computer organization and embedded systems / Carl Hamacher ... [et al.]. – 6th ed.
p. cm.
Includes bibliographical references.
ISBN-13: 978-0-07-338065-0 (alk. paper)
ISBN-10: 0-07-338065-2 (alk. paper)
1. Computer organization. 2. Embedded computer systems. I. Hamacher, V. Carl.
QA76.9.C643.H36 2012
004.2'2–dc22
2010050243

www.mhhe.com

To our families


About the Authors

Carl Hamacher received the B.A.Sc. degree in Engineering Physics from the University of Waterloo, Canada, the M.Sc. degree in Electrical Engineering from Queen’s University, Canada, and the Ph.D. degree in Electrical Engineering from Syracuse University, New York. From 1968 to 1990 he was at the University of Toronto, Canada, where he was a Professor in the Department of Electrical Engineering and the Department of Computer Science. He served as director of the Computer Systems Research Institute during 1984 to 1988, and as chairman of the Division of Engineering Science during 1988 to 1990. In 1991 he joined Queen’s University, where he is now Professor Emeritus in the Department of Electrical and Computer Engineering. He served as Dean of the Faculty of Applied Science from 1991 to 1996. During 1978 to 1979, he was a visiting scientist at the IBM Research Laboratory in San Jose, California. In 1986, he was a research visitor at the Laboratory for Circuits and Systems associated with the University of Grenoble, France. During 1996 to 1997, he was a visiting professor in the Computer Science Department at the University of California at Riverside and in the LIP6 Laboratory of the University of Paris VI. His research interests are in multiprocessors and multicomputers, focusing on their interconnection networks.

Zvonko Vranesic received his B.A.Sc., M.A.Sc., and Ph.D. degrees, all in Electrical Engineering, from the University of Toronto. From 1963 to 1965 he worked as a design engineer with the Northern Electric Co. Ltd. in Bramalea, Ontario. In 1968 he joined the University of Toronto, where he is now a Professor Emeritus in the Department of Electrical & Computer Engineering. During the 1978–79 academic year, he was a Senior Visitor at the University of Cambridge, England, and during 1984–85 he was at the University of Paris, 6. From 1995 to 2000 he served as Chair of the Division of Engineering Science at the University of Toronto. He is also involved in research and development at the Altera Toronto Technology Center. His current research interests include computer architecture and field-programmable VLSI technology. He is a coauthor of four other books: Fundamentals of Digital Logic with VHDL Design, 3rd ed.; Fundamentals of Digital Logic with Verilog Design, 2nd ed.; Microcomputer Structures; and Field-Programmable Gate Arrays. In 1990, he received the Wighton Fellowship for “innovative and distinctive contributions to undergraduate laboratory instruction.” In 2004, he received the Faculty Teaching Award from the Faculty of Applied Science and Engineering at the University of Toronto.

Safwat Zaky received his B.Sc. degree in Electrical Engineering and B.Sc. in Mathematics, both from Cairo University, Egypt, and his M.A.Sc. and Ph.D. degrees in Electrical Engineering from the University of Toronto. From 1969 to 1972 he was with Bell Northern Research, Bramalea, Ontario, where he worked on applications of electro-optics and magnetics in mass storage and telephone switching. In 1973, he joined the University of Toronto, where he is now Professor Emeritus in the Department of Electrical and Computer Engineering. He served as Chair of the Department from 1993 to 2003 and as Vice-Provost from 2003 to 2009. During 1980 to 1981, he was a senior visitor at the Computer Laboratory, University of Cambridge, England. He is a Fellow of the Canadian Academy of Engineering. His research interests are in the areas of computer architecture, digital-circuit design, and electromagnetic compatibility. He is a coauthor of the book Microcomputer Structures and is a recipient of the IEEE Third Millennium Medal and of the Vivek Goel Award for distinguished service to the University of Toronto.

Naraig Manjikian received his B.A.Sc. degree in Computer Engineering and M.A.Sc. degree in Electrical Engineering from the University of Waterloo, Canada, and his Ph.D. degree in Electrical Engineering from the University of Toronto. In 1997, he joined Queen’s University, Kingston, Canada, where he is now an Associate Professor in the Department of Electrical and Computer Engineering. From 2004 to 2006, he served as Undergraduate Chair for Computer Engineering. From 2006 to 2007, he served as Acting Head of the Department of Electrical and Computer Engineering, and from 2007 until 2009, he served as Associate Head for Student and Alumni Affairs. During 2003 to 2004, he was a visiting professor at McGill University, Montreal, Canada, and the University of British Columbia. During 2010 to 2011, he was a visiting professor at McGill University. His research interests are in the areas of computer architecture, multiprocessor systems, field-programmable VLSI technology, and applications of parallel processing.

Preface

This book is intended for use in a first-level course on computer organization and embedded systems in electrical engineering, computer engineering, and computer science curricula. The book is self-contained, assuming only that the reader has a basic knowledge of computer programming in a high-level language. Many students who study computer organization will have had an introductory course on digital logic circuits. Therefore, this subject is not covered in the main body of the book. However, we have provided an extensive appendix on logic circuits for those students who need it.

The book reflects our experience in teaching three distinct groups of students: electrical and computer engineering undergraduates, computer science undergraduates, and engineering science undergraduates. We have always approached the teaching of courses on computer organization from a practical point of view. Thus, a key consideration in shaping the contents of the book has been to carefully explain the main principles, supported by examples drawn from commercially available processors. Our main commercial examples are based on Altera’s Nios II, Freescale’s ColdFire, ARM, and Intel’s IA-32 architectures.

It is important to recognize that digital system design is not a straightforward process of applying optimal design algorithms. Many design decisions are based largely on heuristic judgment and experience. They involve cost/performance and hardware/software tradeoffs over a range of alternatives. It is our goal to convey these notions to the reader.

The book is aimed at a one-semester course in engineering or computer science programs. It is suitable for both hardware- and software-oriented students. Even though the emphasis is on hardware, we have addressed a number of relevant software issues.

McGraw-Hill maintains a Website with support material for the book at http://www.mhhe.com/hamacher.

Scope of the Book

The first three chapters introduce the basic structure of computers, the operations that they perform at the machine-instruction level, and input/output methods as seen by a programmer. The fourth chapter provides an overview of the system software needed to translate programs written in assembly and high-level languages into machine language and to manage their execution. The remaining eight chapters deal with the organization, interconnection, and performance of hardware units in modern computers, including coverage of embedded systems.

Five substantial appendices are provided. The first appendix covers digital logic circuits. Then, four current commercial instruction set architectures (Altera’s Nios II, Freescale’s ColdFire, ARM, and Intel’s IA-32) are described in separate appendices.

Chapter 1 provides an overview of computer hardware and informally introduces terms that are discussed in more depth in the remainder of the book. This chapter discusses the basic functional units and the ways they interact to form a complete computer system. Number and character representations are discussed, along with basic arithmetic operations. An introduction to performance issues and a brief treatment of the history of computer development are also provided.

Chapter 2 gives a methodical treatment of machine instructions, addressing techniques, and instruction sequencing. Program examples at the machine-instruction level, expressed in a generic assembly language, are used to discuss concepts that include loops, subroutines, and stacks. The concepts are introduced using a RISC-style instruction set architecture. A comparison with CISC-style instruction sets is also included.

Chapter 3 presents a programmer’s view of basic input/output techniques. It explains how program-controlled I/O is performed using polling, as well as how interrupts are used in I/O transfers.

Chapter 4 considers system software. The tasks performed by compilers, assemblers, linkers, and loaders are explained. Utility programs that trace and display the results of executing a program are described. Operating system routines that manage the execution of user programs and their input/output operations, including the handling of interrupts, are also described.

Chapter 5 explores the design of a RISC-style processor. This chapter explains the sequence of processing steps needed to fetch and execute the different types of machine instructions. It then develops the hardware organization needed to implement these processing steps. The differing requirements of CISC-style processors are also considered.

Chapter 6 provides coverage of the use of pipelining and multiple execution units in the design of high-performance processors. A pipelined version of the RISC-style processor design from Chapter 5 is used to illustrate pipelining. The role of the compiler and the relationship between pipelined execution and instruction set design are explored.
Superscalar processors are discussed.

Input/output hardware is considered in Chapter 7. Interconnection networks, including the bus structure, are discussed. Synchronous and asynchronous operation is explained. Interconnection standards, including USB and PCI Express, are also presented.

Semiconductor memories, including SDRAM, Rambus, and Flash memory implementations, are discussed in Chapter 8. Caches are explained as a way of increasing the memory bandwidth. They are discussed in some detail, including performance modeling. Virtual-memory systems, memory management, and rapid address-translation techniques are also presented. Magnetic and optical disks are discussed as components in the memory hierarchy.

Chapter 9 explores the implementation of the arithmetic unit of a computer. Logic design for fixed-point add, subtract, multiply, and divide hardware, operating on 2’s-complement numbers, is described. Carry-lookahead adders and high-speed multipliers are explained, including descriptions of the Booth multiplier recoding and carry-save addition techniques. Floating-point number representation and operations, in the context of the IEEE Standard, are presented.

Today, far more processors are in use in embedded systems than in general-purpose computers. Chapters 10 and 11 are dedicated to the subject of embedded systems. First, basic aspects of system integration, component interconnections, and real-time operation are presented in Chapter 10. The use of microcontrollers is discussed. Then, Chapter 11 concentrates on system-on-a-chip (SoC) implementations, in which a single chip integrates the processing, memory, I/O, and timer functionality needed to satisfy application-specific requirements. A substantial example shows how FPGAs and modern design tools can be used in this environment.

Chapter 12 focuses on parallel processing and performance. Hardware multithreading and vector processing are introduced as enhancements in a single processor. Shared-memory multiprocessors are then described, along with the issue of cache coherence. Interconnection networks for multiprocessors are presented.

Appendix A provides extensive coverage of logic circuits, intended for a reader who has not taken a course on the design of such circuits.

Appendices B, C, D, and E illustrate how the instruction set concepts introduced in Chapters 2 and 3 are implemented in four commercial processors: Nios II, ColdFire, ARM, and Intel IA-32. The Nios II and ARM processors illustrate the RISC design style. ColdFire has an easy-to-teach CISC design, while the IA-32 CISC architecture represents the most successful commercial design. The presentation for each processor includes assembly-language examples from Chapters 2 and 3, implemented in the context of that processor. The details given in these appendices are not essential for understanding the material in the main body of the book. It is sufficient to cover only one of these appendices to gain an appreciation for commercial processor instruction sets. The choice of a processor to use as an example is likely to be influenced by the equipment in an accompanying laboratory. Instructors may wish to use more than one processor to illustrate the different design approaches.

Changes in the Sixth Edition

Substantial changes in content and organization have been made in preparing the sixth edition of this book. They include the following:

• The basic concepts of instruction set architecture are now covered using the RISC-style approach. This is followed by a comparative examination of the CISC-style approach.

• The processor design discussion is focused on a RISC-style implementation, which leads naturally to pipelined operation.

• Two chapters on embedded systems are included: one dealing with the basic structure of such systems and the use of microcontrollers, and the other dealing with system-on-a-chip implementations.

• Appendices are used to give examples of four commercial processors. Each appendix includes the essential information about the instruction set architecture of the given processor.

• Solved problems have been included in a new section toward the end of chapters and appendices. They provide the student with solutions that can be expected for typical problems.

Difficulty Level of Problems

The problems at the end of chapters and appendices have been classified as easy (E), medium (M), or difficult (D). These classifications should be interpreted as follows:

Easy—Solutions can be derived in a few minutes by direct application of specific information presented in one place in the relevant section of the book.



Medium—Use of the book material in a way that does not directly follow any examples presented is usually needed. In some cases, solutions may follow the general pattern of an example, but will take longer to develop than those for easy problems.



Difficult—Some additional insight is needed to solve these problems. If a solution requires a program to be written, its underlying algorithm or form may be quite different from that of any program example given in the book. If a hardware design is required, it may involve an arrangement and interconnection of basic logic circuit components that is quite different from any design shown in the book. If a performance analysis is needed, it may involve the derivation of an algebraic expression.

What Can Be Covered in a One-Semester Course

This book is suitable for use at the university or college level as a text for a one-semester course in computer organization. It is intended for the first course that students will take on computer organization. There is more than enough material in the book for a one-semester course. The core material on computer organization and relevant software issues is given in Chapters 1 through 9. For students who have not had a course in logic circuits, the material in Appendix A should be studied near the beginning of a course and certainly prior to covering Chapter 5.

A course aimed at embedded systems should include Chapters 1, 2, 3, 4, 7, 8, 10, and 11.

Use of the material on commercial processor examples in Appendices B through E can be guided by instructor and student interest, as well as by relevance to any hardware laboratory associated with a course.

Acknowledgments

We wish to express our thanks to many people who have helped us during the preparation of this sixth edition of the book.

Our colleagues Daniel Etiemble of University of Paris South and Glenn Gulak of University of Toronto provided numerous comments and suggestions that helped significantly in shaping the material. Blair Fort and Dan Vranesic provided valuable help with some of the programming examples.

Warren R. Carithers of Rochester Institute of Technology, Krishna M. Kavi of University of North Texas, and Nelson Luiz Passos of Midwestern State University provided reviews of material from both the fifth and sixth editions of the book.

The following people provided reviews of material from the fifth edition of the book: Goh Hock Ann of Multimedia University, Joseph E. Beaini of University of Colorado Denver, Kalyan Mohan Goli of Jawaharlal Nehru Technological University, Jaimon Jacob of Model Engineering College Ernakulam, M. Kumaresan of Anna University Coimbatore, Kenneth K. C. Lee of City University of Hong Kong, Manoj Kumar Mishra of Institute of Technical Education and Research, Junita Mohamad-Saleh of Universiti Sains Malaysia, Prashanta Kumar Patra of College of Engineering and Technology Bhubaneswar, Shanq-Jang Ruan of National Taiwan University of Science and Technology, S. D. Samantaray of G. B. Pant University of Agriculture and Technology, Shivakumar Sastry of University of Akron, Donatella Sciuto of Politecnico of Milano, M. P. Singh of National Institute of Technology Patna, Albert Starling of University of Arkansas, Shannon Tauro of University of California Irvine, R. Thangarajan of Kongu Engineering College, Ashok Kumar Turuk of National Institute of Technology Rourkela, and Philip A. Wilsey of University of Cincinnati.

Finally, we truly appreciate the support of Raghothaman Srinivasan, Peter E. Massar, Darlene M. Schueller, Lisa Bruflodt, Curt Reynolds, Brenda Rolwes, and Laura Fuller at McGraw-Hill.

Carl Hamacher
Zvonko Vranesic
Safwat Zaky
Naraig Manjikian

McGraw-Hill Create™

Craft your teaching resources to match the way you teach! With McGraw-Hill Create, www.mcgrawhillcreate.com, you can easily rearrange chapters, combine material from other content sources, and quickly upload content you have written like your course syllabus or teaching notes. Find the content you need in Create by searching through thousands of leading McGraw-Hill textbooks. Arrange your book to fit your teaching style. Create even allows you to personalize your book’s appearance by selecting the cover and adding your name, school, and course information. Order a Create book and you’ll receive a complimentary print review copy in 3-5 business days or a complimentary electronic review copy (eComp) via email in minutes. Go to www.mcgrawhillcreate.com today and register to experience how McGraw-Hill Create empowers you to teach your students your way.

McGraw-Hill Higher Education and Blackboard® have teamed up. Blackboard, the Web-based course management system, has partnered with McGraw-Hill to better allow students and faculty to use online materials and activities to complement face-to-face teaching. Blackboard features exciting social learning and teaching tools that foster more logical, visually impactful and active learning opportunities for students. You’ll transform your closed-door classrooms into communities where students remain connected to their educational experience 24 hours a day. This partnership allows you and your students access to McGraw-Hill’s Create right from within your Blackboard course - all with one single sign-on. McGraw-Hill and Blackboard can now offer you easy access to industry leading technology and content, whether your campus hosts it, or we do. Be sure to ask your local McGraw-Hill representative for details.

Contents

Chapter 1 Basic Structure of Computers
1.1 Computer Types
1.2 Functional Units
    1.2.1 Input Unit
    1.2.2 Memory Unit
    1.2.3 Arithmetic and Logic Unit
    1.2.4 Output Unit
    1.2.5 Control Unit
1.3 Basic Operational Concepts
1.4 Number Representation and Arithmetic Operations
    1.4.1 Integers
    1.4.2 Floating-Point Numbers
1.5 Character Representation
1.6 Performance
    1.6.1 Technology
    1.6.2 Parallelism
1.7 Historical Perspective
    1.7.1 The First Generation
    1.7.2 The Second Generation
    1.7.3 The Third Generation
    1.7.4 The Fourth Generation
1.8 Concluding Remarks
1.9 Solved Problems
Problems
References

Chapter 2 Instruction Set Architecture
2.1 Memory Locations and Addresses
    2.1.1 Byte Addressability
    2.1.2 Big-Endian and Little-Endian Assignments
    2.1.3 Word Alignment
    2.1.4 Accessing Numbers and Characters
2.2 Memory Operations
2.3 Instructions and Instruction Sequencing
    2.3.1 Register Transfer Notation
    2.3.2 Assembly-Language Notation
    2.3.3 RISC and CISC Instruction Sets
    2.3.4 Introduction to RISC Instruction Sets
    2.3.5 Instruction Execution and Straight-Line Sequencing
    2.3.6 Branching
    2.3.7 Generating Memory Addresses
2.4 Addressing Modes
    2.4.1 Implementation of Variables and Constants
    2.4.2 Indirection and Pointers
    2.4.3 Indexing and Arrays
2.5 Assembly Language
    2.5.1 Assembler Directives
    2.5.2 Assembly and Execution of Programs
    2.5.3 Number Notation
2.6 Stacks
2.7 Subroutines
    2.7.1 Subroutine Nesting and the Processor Stack
    2.7.2 Parameter Passing
    2.7.3 The Stack Frame
2.8 Additional Instructions
    2.8.1 Logic Instructions
    2.8.2 Shift and Rotate Instructions
    2.8.3 Multiplication and Division
2.9 Dealing with 32-Bit Immediate Values
2.10 CISC Instruction Sets
    2.10.1 Additional Addressing Modes
    2.10.2 Condition Codes
2.11 RISC and CISC Styles
2.12 Example Programs
    2.12.1 Vector Dot Product Program
    2.12.2 String Search Program
2.13 Encoding of Machine Instructions
2.14 Concluding Remarks
2.15 Solved Problems
Problems

Chapter 3 Basic Input/Output
3.1 Accessing I/O Devices
    3.1.1 I/O Device Interface
    3.1.2 Program-Controlled I/O
    3.1.3 An Example of a RISC-Style I/O Program
    3.1.4 An Example of a CISC-Style I/O Program
3.2 Interrupts
    3.2.1 Enabling and Disabling Interrupts
    3.2.2 Handling Multiple Devices
    3.2.3 Controlling I/O Device Behavior
    3.2.4 Processor Control Registers
    3.2.5 Examples of Interrupt Programs
    3.2.6 Exceptions
3.3 Concluding Remarks
3.4 Solved Problems
Problems

Chapter 4 Software
4.1 The Assembly Process
    4.1.1 Two-pass Assembler
4.2 Loading and Executing Object Programs
4.3 The Linker
4.4 Libraries
4.5 The Compiler
    4.5.1 Compiler Optimizations
    4.5.2 Combining Programs Written in Different Languages
4.6 The Debugger
4.7 Using a High-level Language for I/O Tasks
4.8 Interaction between Assembly Language and C Language
4.9 The Operating System
    4.9.1 The Boot-strapping Process
    4.9.2 Managing the Execution of Application Programs
    4.9.3 Use of Interrupts in Operating Systems
4.10 Concluding Remarks
Problems
References

Chapter 5 Basic Processing Unit
5.1 Some Fundamental Concepts
5.2 Instruction Execution
    5.2.1 Load Instructions
    5.2.2 Arithmetic and Logic Instructions
    5.2.3 Store Instructions
5.3 Hardware Components
    5.3.1 Register File
    5.3.2 ALU
    5.3.3 Datapath
    5.3.4 Instruction Fetch Section
5.4 Instruction Fetch and Execution Steps
    5.4.1 Branching
    5.4.2 Waiting for Memory
5.5 Control Signals
5.6 Hardwired Control
    5.6.1 Datapath Control Signals
    5.6.2 Dealing with Memory Delay
5.7 CISC-Style Processors
    5.7.1 An Interconnect using Buses
    5.7.2 Microprogrammed Control
5.8 Concluding Remarks
5.9 Solved Problems
Problems

Chapter 6 Pipelining
6.1 Basic Concept—The Ideal Case
6.2 Pipeline Organization
6.3 Pipelining Issues
6.4 Data Dependencies
    6.4.1 Operand Forwarding
    6.4.2 Handling Data Dependencies in Software
6.5 Memory Delays
6.6 Branch Delays
    6.6.1 Unconditional Branches
    6.6.2 Conditional Branches
    6.6.3 The Branch Delay Slot
    6.6.4 Branch Prediction
6.7 Resource Limitations
6.8 Performance Evaluation
    6.8.1 Effects of Stalls and Penalties
    6.8.2 Number of Pipeline Stages
6.9 Superscalar Operation
    6.9.1 Branches and Data Dependencies
    6.9.2 Out-of-Order Execution
    6.9.3 Execution Completion
    6.9.4 Dispatch Operation
6.10 Pipelining in CISC Processors
    6.10.1 Pipelining in ColdFire Processors
    6.10.2 Pipelining in Intel Processors
6.11 Concluding Remarks
6.12 Examples of Solved Problems
Problems
References

Chapter 7 Input/Output Organization
7.1 Bus Structure
7.2 Bus Operation
    7.2.1 Synchronous Bus
    7.2.2 Asynchronous Bus
    7.2.3 Electrical Considerations
7.3 Arbitration
7.4 Interface Circuits
    7.4.1 Parallel Interface
    7.4.2 Serial Interface
7.5 Interconnection Standards
    7.5.1 Universal Serial Bus (USB)
    7.5.2 FireWire
    7.5.3 PCI Bus
    7.5.4 SCSI Bus
    7.5.5 SATA
    7.5.6 SAS
    7.5.7 PCI Express
7.6 Concluding Remarks
7.7 Solved Problems
Problems
References

Chapter 8 The Memory System
8.1 Basic Concepts
8.2 Semiconductor RAM Memories
    8.2.1 Internal Organization of Memory Chips
    8.2.2 Static Memories
    8.2.3 Dynamic RAMs
    8.2.4 Synchronous DRAMs
    8.2.5 Structure of Larger Memories
8.3 Read-only Memories
    8.3.1 ROM
    8.3.2 PROM
    8.3.3 EPROM
    8.3.4 EEPROM
    8.3.5 Flash Memory
8.4 Direct Memory Access
8.5 Memory Hierarchy
8.6 Cache Memories
    8.6.1 Mapping Functions
    8.6.2 Replacement Algorithms
    8.6.3 Examples of Mapping Techniques
8.7 Performance Considerations
    8.7.1 Hit Rate and Miss Penalty
    8.7.2 Caches on the Processor Chip
    8.7.3 Other Enhancements
8.8 Virtual Memory
    8.8.1 Address Translation
8.9 Memory Management Requirements
8.10 Secondary Storage
    8.10.1 Magnetic Hard Disks
    8.10.2 Optical Disks
    8.10.3 Magnetic Tape Systems
8.11 Concluding Remarks
8.12 Solved Problems
Problems
References

Chapter 9 Arithmetic
9.1 Addition and Subtraction of Signed Numbers
    9.1.1 Addition/Subtraction Logic Unit
9.2 Design of Fast Adders
    9.2.1 Carry-Lookahead Addition
9.3 Multiplication of Unsigned Numbers
    9.3.1 Array Multiplier
    9.3.2 Sequential Circuit Multiplier
9.4 Multiplication of Signed Numbers
    9.4.1 The Booth Algorithm
9.5 Fast Multiplication
    9.5.1 Bit-Pair Recoding of Multipliers
    9.5.2 Carry-Save Addition of Summands
    9.5.3 Summand Addition Tree using 3-2 Reducers
    9.5.4 Summand Addition Tree using 4-2 Reducers
    9.5.5 Summary of Fast Multiplication
9.6 Integer Division
9.7 Floating-Point Numbers and Operations
    9.7.1 Arithmetic Operations on Floating-Point Numbers
    9.7.2 Guard Bits and Truncation
    9.7.3 Implementing Floating-Point Operations
9.8 Decimal-to-Binary Conversion
9.9 Concluding Remarks
9.10 Solved Problems
Problems
References

Chapter 10 Embedded Systems
10.1 Examples of Embedded Systems
    10.1.1 Microwave Oven
    10.1.2 Digital Camera
    10.1.3 Home Telemetry
10.2 Microcontroller Chips for Embedded Applications
10.3 A Simple Microcontroller
    10.3.1 Parallel I/O Interface
    10.3.2 Serial I/O Interface
    10.3.3 Counter/Timer
    10.3.4 Interrupt-Control Mechanism
    10.3.5 Programming Examples
10.4 Reaction Timer—A Complete Example
10.5 Sensors and Actuators
    10.5.1 Sensors
    10.5.2 Actuators
    10.5.3 Application Examples
10.6 Microcontroller Families
    10.6.1 Microcontrollers Based on the Intel 8051
    10.6.2 Freescale Microcontrollers
    10.6.3 ARM Microcontrollers
10.7 Design Issues
10.8 Concluding Remarks
Problems
References

Chapter 11 System-on-a-Chip—A Case Study
11.1 FPGA Implementation
    11.1.1 FPGA Devices
    11.1.2 Processor Choice
11.2 Computer-Aided Design Tools
    11.2.1 Altera CAD Tools
11.3 Alarm Clock Example
    11.3.1 User’s View of the System
    11.3.2 System Definition and Generation
    11.3.3 Circuit Implementation
    11.3.4 Application Software
11.4 Concluding Remarks
Problems
References

Chapter 12 Parallel Processing and Performance
12.1 Hardware Multithreading
12.2 Vector (SIMD) Processing
    12.2.1 Graphics Processing Units (GPUs)
12.3 Shared-Memory Multiprocessors
    12.3.1 Interconnection Networks
12.4 Cache Coherence
    12.4.1 Write-Through Protocol
    12.4.2 Write-Back protocol
    12.4.3 Snoopy Caches
    12.4.4 Directory-Based Cache Coherence
12.5 Message-Passing Multicomputers
12.6 Parallel Programming for Multiprocessors
12.7 Performance Modeling
12.8 Concluding Remarks
Problems
References

Appendix A Logic Circuits
A.1 Basic Logic Functions
    A.1.1 Electronic Logic Gates
A.2 Synthesis of Logic Functions
A.3 Minimization of Logic Expressions
    A.3.1 Minimization using Karnaugh Maps
    A.3.2 Don’t-Care Conditions
A.4 Synthesis with NAND and NOR Gates
A.5 Practical Implementation of Logic Gates
    A.5.1 CMOS Circuits
    A.5.2 Propagation Delay
    A.5.3 Fan-In and Fan-Out Constraints
    A.5.4 Tri-State Buffers
A.6 Flip-Flops
    A.6.1 Gated Latches
    A.6.2 Master-Slave Flip-Flop
    A.6.3 Edge Triggering
    A.6.4 T Flip-Flop
    A.6.5 JK Flip-Flop
    A.6.6 Flip-Flops with Preset and Clear
A.7 Registers and Shift Registers
A.8 Counters
A.9 Decoders
A.10 Multiplexers
A.11 Programmable Logic Devices (PLDs)
    A.11.1 Programmable Logic Array (PLA)
    A.11.2 Programmable Array Logic (PAL)
    A.11.3 Complex Programmable Logic Devices (CPLDs)
A.12 Field-Programmable Gate Arrays
A.13 Sequential Circuits
    A.13.1 Design of an Up/Down Counter as a Sequential Circuit
    A.13.2 Timing Diagrams
    A.13.3 The Finite State Machine Model
    A.13.4 Synthesis of Finite State Machines
A.14 Concluding Remarks
Problems
References

Appendix B The Altera Nios II Processor
B.1 Nios II Characteristics
B.2 General-Purpose Registers
B.3 Addressing Modes
B.4 Instructions
    B.4.1 Notation
    B.4.2 Load and Store Instructions
    B.4.3 Arithmetic Instructions
    B.4.4 Logic Instructions
    B.4.5 Move Instructions
    B.4.6 Branch and Jump Instructions
    B.4.7 Subroutine Linkage Instructions
    B.4.8 Comparison Instructions
    B.4.9 Shift Instructions
    B.4.10 Rotate Instructions
    B.4.11 Control Instructions
B.5 Pseudoinstructions
B.6 Assembler Directives
B.7 Carry and Overflow Detection
B.8 Example Programs
B.9 Control Registers
B.10 Input/Output
    B.10.1 Program-Controlled I/O
    B.10.2 Interrupts and Exceptions
B.11 Advanced Configurations of Nios II Processor
    B.11.1 External Interrupt Controller
    B.11.2 Memory Management Unit
    B.11.3 Floating-Point Hardware
B.12 Concluding Remarks
B.13 Solved Problems
Problems

Appendix

C

The ColdFire Processor 571 C.1 Memory Organization 572 C.2 Registers 572 C.3 Instructions 573 C.3.1 C.3.2 C.3.3 C.3.4 C.3.5 C.3.6 C.3.7

Addressing Modes 575 Move Instruction 577 Arithmetic Instructions 578 Branch and Jump Instructions 582 Logic Instructions 585 Shift Instructions 586 Subroutine Linkage Instructions 587

C.4 Assembler Directives 593 C.5 Example Programs 594 C.5.1 C.5.2

Vector Dot Product Program 594 String Search Program 595

C.6 Mode of Operation and Other Control Features 596 C.7 Input/Output 597 C.8 Floating-Point Operations 599 C.8.1

FMOVE Instruction 599

xx

Contents C.8.2 C.8.3 C.8.4 C.8.5

Floating-Point Arithmetic Instructions 600 Comparison and Branch Instructions 601 Additional Floating-Point Instructions 601 Example Floating-Point Program 602

C.9 Concluding Remarks 603 C.10 Solved Problems 603 Problems 608 References 609 Appendix

Appendix

661 611

D.1 ARM Characteristics 612 Unusual Aspects of the ARM Architecture 612

E.1 Memory Organization 662 E.2 Register Structure 662 E.3 Addressing Modes 665 E.4 Instructions 668 E.4.1 E.4.2 E.4.3 E.4.4 E.4.5 E.4.6 E.4.7 E.4.8 E.4.9 E.4.10

D.2 Register Structure 613 D.3 Addressing Modes 614 D.3.1 D.3.2 D.3.3 D.3.4 D.3.5 D.3.6

Basic Indexed Addressing Mode 614 Relative Addressing Mode 615 Index Modes with Writeback 616 Offset Determination 616 Register, Immediate, and Absolute Addressing Modes 618 Addressing Mode Examples 618

D.4 Instructions 621 D.4.1 D.4.2 D.4.3 D.4.4 D.4.5 D.4.6 D.4.7 D.4.8

Load and Store Instructions 621 Arithmetic Instructions 622 Move Instructions 625 Logic and Test Instructions 626 Compare Instructions 627 Setting Condition Code Flags 628 Branch Instructions 628 Subroutine Linkage Instructions 631

D.5 Assembly Language D.5.1 D.6.1 D.6.2

635

Pseudoinstructions 637

D.6 Example Programs

638

Vector Dot Product 639 String Search 639

D.7 Operating Modes and Exceptions 639 D.7.1 D.7.2 D.7.3 D.7.4

Banked Registers 641 Exception Types 642 System Mode 644 Handling Exceptions 644

D.8 Input/Output 646 D.8.1 D.8.2

E

The Intel IA-32 Architecture

D

The ARM Processor D.1.1

D.9 Conditional Execution of Instructions 648 D.10 Coprocessors 650 D.11 Embedded Applications and the Thumb ISA 651 D.12 Concluding Remarks 651 D.13 Solved Problems 652 Problems 657 References 660

Program-Controlled I/O 646 Interrupt-Driven I/O 648

Machine Instruction Format 670 Assembly-Language Notation 670 Move Instruction 671 Load-Effective-Address Instruction 671 Arithmetic Instructions 672 Jump and Loop Instructions 674 Logic Instructions 677 Shift and Rotate Instructions 678 Subroutine Linkage Instructions 679 Operations on Large Numbers 681

E.5 Assembler Directives 685 E.6 Example Programs 686 E.6.1 E.6.2

E.7 E.8 E.9

Vector Dot Product Program 686 String Search Program 686

Interrupts and Exceptions 687 Input/Output Examples 689 Scalar Floating-Point Operations E.9.1 E.9.2 E.9.3 E.9.4 E.9.5

690

Load and Store Instructions 692 Arithmetic Instructions 693 Comparison Instructions 694 Additional Instructions 694 Example Floating-Point Program 694

E.10 Multimedia Extension (MMX) Operations 695 E.11 Vector (SIMD) Floating-Point Operations 696 E.12 Examples of Solved Problems 697 E.13 Concluding Remarks 702 Problems 702 References 703