"People who are really serious about software should make their own hardware"
-- Alan Kay
Overview
In this course, we will cover how modern processors are designed to achieve high performance, and the constraints they must work within. Topics include: instruction set design; processor micro-architecture and pipelining; cache and virtual memory organizations; protection and sharing; I/O and interrupts; in-order and out-of-order superscalar architectures; memory models and synchronization; embedded systems; and parallel computers.
You will get hands-on experience with hardware design through a sequence of gently guided labs, and get to see your ideas improve performance on real metal! We will use RISC-V, a modern, real-world, open-source ISA, as our learning tool.
Lecturer: Sang-Woo Jun
Lectures: TuTh 8PM - 9:20PM @ EH 1200
Discussion: Fri 7PM @ DBH 1100
Piazza: Link provided in Canvas
Mid-terms: November 1st, usual lecture hall!
Schedule And Material
2022-09-22 |
Lecture 1: Introduction, Moore's Law
|
2022-09-27 |
Lecture 2: Hardware-Software Interface
- Slides - TBA
- Slides - HW-SW Interface
- Slides - RISC-V and x86
- The Digital Antiquarian, "Doing Windows" Series [link]: A fascinating history of how Microsoft Windows became what others couldn't, and how innovations in the Intel 80386 helped make it happen.
- LGR, Installing MS-DOS on an AMD Ryzen Gaming PC [link]: Testament to the strict backwards-compatibility of x86!
|
2022-09-29 |
Lecture 3: ISA Encoding and Complexity
- Slides - ISA Encoding
- Swanson Technologies, The Art of Picking Intel Registers [link]: A bit more detail on how no x86 register is quite "general-purpose".
|
2022-10-04 |
Lecture 4: Digital Circuits Why and How
|
2022-10-06 |
Lecture 5: Pipelining
|
2022-10-11, 2022-10-13 |
Lecture 6: Fast and Correct Pipelining
|
2022-10-18 |
Lecture 6: Fast and Correct Pipelining
|
2022-10-20 |
Lecture 7: Caches and the memory system
|
2022-10-27 |
Lecture 8: Architectural support for the Operating System
|
2022-11-01 |
Lecture 9: Virtual Memory
|
2022-11-10 |
Lecture 10: Multiprocessing
- Slides
- (Slightly) asymmetrical multiprocessing with ARM big.LITTLE architecture: link
- Understanding memory models is important for fast kernels (email from Linus Torvalds) link
|
2022-11-17 |
Lecture 11: GPU Introduction
|
2022-11-19 |
Lecture 12: Application-specific accelerators
|
2022-12-01 |
Lecture 13: Datacenter architecture
|
2022-12-01 |
Lecture 14: Bonus: FPGAs
- Slides
- A really nice video from Intel introducing FPGAs [link]
|
Homework
- Homework 1: Take-home quiz [link]
- Homework 2: Architecture-aware software performance engineering [link]
- Homework 3: Take-home quiz, which will also act as a practice test for the final [link]
- Optional work for additional credit: Single-page essay. Additional credit based on quality of submission, up to a letter grade improvement. Due Dec. 12
Pick one of the two prompts and submit an essay:
- Prompt 1: The Compute Express Link (CXL) is expected to massively impact future datacenter architectures. One expected use case of CXL is disaggregated memory. Given the performance and cost profile of CXL-based memory, how will popular applications like databases be affected by this technology? Will it be a primarily cost-improvement solution, performance-improvement solution, privacy-improvement solution, or all of the above?
- Prompt 2: Intel Software Guard Extensions (SGX) is an ISA extension that aims to improve privacy. Describe its usage. How does it protect against popular side-channel attacks like Spectre, and will it continue to be safe in the foreseeable future?
Videos
Grading
Homework: 50%, midterm exam: 25%, final exam: 25% of your grade (all grades curved).
Late homework policy: You can submit late homework up to 5 days after the deadline for 60% of the credit.
Book
This class does not have a mandatory book.
However, it may be helpful to consult
Computer Organization and Design RISC-V Edition: The Hardware Software Interface (by David A. Patterson and John L. Hennessy).
Exam resources
- 2020 CS152 mid-terms link