Prof. Vivek Sarkar, DH 3131
Dr. Eric Allen
Prasanth Chatarasi, Peng Du, Xian Fan, Max Grossman
Please send all emails to comp322-staff at rice dot edu
Undergraduate TAs: Matthew Bernhard, Nicholas Hanson-Holtry, Yi Hua, Yoko Li, Ayush Narayan, Derek Peirce, Maggie Tang, Wei Zeng, Glenn Zhu
Vincent Cavé, John Greiner, Shams Imam
Herzstein Hall 210
MWF 1:00pm - 1:50pm
DH 1064 (Section A01), DH 1070 (Section A02)
Wednesday, 07:00pm - 08:30pm
A summary PDF file containing the course syllabus can be found here. Much of the syllabus information is also included below on this course web site, along with some additional details that are not in the syllabus.
The primary goal of COMP 322 is to introduce you to the fundamentals of parallel programming and parallel algorithms, by following a pedagogic approach that exposes you to the intellectual challenges in parallel software without enmeshing you in the jargon and lower-level details of today's parallel systems. A strong grasp of the course fundamentals will enable you to quickly pick up any specific parallel programming system that you may encounter in the future, and also prepare you for studying advanced topics related to parallelism and concurrency in courses such as COMP 422.
The desired learning outcomes fall into three major areas (course modules):
1) Parallelism: creation and coordination of parallelism (async, finish), abstract performance metrics (work, critical paths), Amdahl's Law, weak vs. strong scaling, data races and determinism, data race avoidance (immutability, futures, accumulators, dataflow), deadlock avoidance, abstract vs. real performance (granularity, scalability), collective & point-to-point synchronization (phasers, barriers), parallel algorithms, systolic algorithms.
2) Concurrency: critical sections, atomicity, isolation, high level data races, nondeterminism, linearizability, liveness/progress guarantees, actors, request-response parallelism, Java Concurrency, locks, condition variables, semaphores, memory consistency models.
3) Locality & Distribution: memory hierarchies, locality, cache affinity, data movement, message-passing (MPI), communication overheads (bandwidth, latency), MapReduce, accelerators, GPGPUs, CUDA, OpenCL.
To achieve these learning outcomes, each class period will include time for both instructor lectures and in-class exercises based on assigned reading and videos. The lab exercises will be used to help students gain hands-on programming experience with the concepts introduced in the lectures.
To ensure that students gain a strong knowledge of parallel programming foundations, the classes and homeworks will place equal emphasis on both theory and practice. The programming component of the course will mostly use the Habanero-Java Library (HJ-lib) pedagogic extension to the Java language developed in the Habanero Extreme Scale Software Research project at Rice University. The course will also introduce you to real-world parallel programming models including Java Concurrency, MapReduce, MPI, OpenCL and CUDA. An important goal is that, at the end of COMP 322, you should feel comfortable programming in any parallel language for which you are familiar with the underlying sequential language (Java or C). Any parallel programming primitives that you encounter in the future should be easily recognizable based on the fundamentals studied in COMP 322.
The prerequisite course requirements are COMP 182 and COMP 215. COMP 322 should be accessible to anyone familiar with the foundations of sequential algorithms and data structures, and with basic Java programming. COMP 321 is also recommended as a co-requisite.
There are no required textbooks for the class. Instead, lecture handouts are provided for each module as follows. The links to the latest versions on Owlspace are included below:
You are expected to read the relevant sections in each lecture handout before coming to the lecture. We will also provide a number of references in the slides and handouts.
There are also a few optional textbooks that we will draw from quite heavily. You are encouraged to get copies of any or all of these books. They will serve as useful references both during and after this course:
Assigned Videos (Quizzes due by Friday of each week)
Lecture 1: The What and Why of Parallel Programming, Task Creation and Termination (Async, Finish)
|Module 1: Sections 0.1, 0.2, 1.1||worksheet1||lec1-slides|
Lecture 2: Computation Graphs, Ideal Parallelism
|Module 1: Sections 1.2, 1.3||Topic 1.2 Lecture, Topic 1.2 Demonstration, Topic 1.3 Lecture, Topic 1.3 Demonstration||worksheet2||lec2-slides|
|Fri||Jan 16||Lecture 3: Abstract Performance Metrics, Multiprocessor Scheduling||Module 1: Section 1.4||Topic 1.4 Lecture, Topic 1.4 Demonstration||worksheet3||lec3-slides||Homework 1||Lecture & demo quizzes for topics 1.1, 1.2, 1.3, 1.4|
No lecture, School Holiday (Martin Luther King, Jr. Day)
Lecture 4: Parallel Speedup and Amdahl's Law
|Module 1: Section 1.5||Topic 1.5 Lecture, Topic 1.5 Demonstration||worksheet4||lec4-slides|
Lecture 5: Future Tasks, Functional Parallelism
|Module 1: Section 1.6 (self-study), Section 2.1||Topic 1.6 Lecture, Topic 1.6 Demonstration, Topic 2.1 Lecture, Topic 2.1 Demonstration||worksheet5||lec5-slides||Lecture & demo quizzes for topics 1.5, 1.6, 2.1|
Lecture 6: Finish Accumulators
|Module 1: Section 2.3||Topic 2.3 Lecture, Topic 2.3 Demonstration||worksheet6||lec6-slides|
Lecture 7: Data Races, Functional & Structural Determinism
|Module 1: Sections 2.5, 2.6||Topic 2.5 Lecture, Topic 2.5 Demonstration, Topic 2.6 Lecture, Topic 2.6 Demonstration||worksheet7||lec7-slides||Homework 2||Homework 1|
Lecture 8: Map Reduce
|Module 1: Section 2.4||Topic 2.4 Lecture, Topic 2.4 Demonstration||worksheet8||lec8-slides||Lecture & demo quizzes for topics 2.3, 2.4, 2.5, 2.6|
Lecture 9: Memoization
|Module 1: Section 2.2||Topic 2.2 Lecture, Topic 2.2 Demonstration||worksheet9||lec9-slides|
Lecture 10: Loop-Level Parallelism, Parallel Matrix Multiplication, Iteration Grouping (Chunking)
|Module 1: Sections 3.1, 3.2, 3.3||worksheet10||lec10-slides|
Lecture 11: Barrier Synchronization
|Module 1: Section 3.4||Topic 3.4 Lecture, Topic 3.4 Demonstration||worksheet11||lec11-slides||Lecture & demo quizzes for topics 2.2, 3.1, 3.2, 3.3, 3.4|
Lecture 12: Iterative Averaging Revisited, SPMD pattern
|Module 1: Sections 3.5, 3.6||Topic 3.5 Lecture, Topic 3.5 Demonstration, Topic 3.6 Lecture, Topic 3.6 Demonstration||worksheet12||lec12-slides|
Lecture 13: Java’s ForkJoin Library
Lecture 14: Data-Driven Tasks and Data-Driven Futures
|Module 1: Section 4.5||Topic 4.5 Lecture, Topic 4.5 Demonstration||worksheet14||lec14-slides||Lecture & demo quizzes for topics 3.5, 3.6, 4.5|
Lecture 15: Abstract vs. Real Performance
Lecture 16: Phasers, Point-to-point Synchronization
|Module 1: Sections 4.2, 4.3||Topic 4.2 Lecture, Topic 4.2 Demonstration, Topic 4.3 Lecture, Topic 4.3 Demonstration||worksheet16||lec16-slides|
Lecture 17: Pipeline Parallelism, Signal Statement, Fuzzy Barriers
|Module 1: Sections 4.4, 4.1||Topic 4.4 Lecture, Topic 4.4 Demonstration, Topic 4.1 Lecture, Topic 4.1 Demonstration||worksheet17||lec17-slides||Lecture & demo quizzes for topics 4.1, 4.2, 4.3, 4.4|
Lecture 18: Classification of Parallel Programs
|Topic 4.6 Lecture, Topic 4.6 Demonstration||worksheet18||lec18-slides|
Lecture 19: Midterm Summary, Take-home Exam 1 distributed
No Lecture (Exam 1 due by 4pm today)
|Lecture & demo quizzes for topic 4.6, Exam 1|
Feb 28 - Mar 08
Lecture 20: Critical sections, Isolated construct, Parallel Spanning Tree algorithm
|Module 1: Sections 3.5, 3.6||Topic 5.1 Lecture, Topic 5.1 Demonstration, Topic 5.2 Lecture, Topic 5.2 Demonstration, Topic 5.3 Lecture, Topic 5.3 Demonstration||worksheet20||lec20-slides|
Lecture 21: Eureka-style Speculative Task Parallelism
Lecture 22: Read-Write Isolation, Atomic variables
|Topic 5.4 Lecture, Topic 5.4 Demonstration, Topic 5.5 Lecture, Topic 5.5 Demonstration, Topic 5.6 Lecture, Topic 5.6 Demonstration||worksheet22||lec22-slides|
Homework 3, Lecture & demo quizzes for topics 5.1 to 5.6
Lecture 23: Actors
|Topic 6.1 Lecture, Topic 6.1 Demonstration, Topic 6.2 Lecture, Topic 6.2 Demonstration, Topic 6.3 Lecture, Topic 6.3 Demonstration||worksheet23||lec23-slides|
Lecture 24: Actors (contd)
|Topic 6.6 Lecture, Topic 6.6 Demonstration||worksheet24||lec24-slides|
Lecture 25: Concurrent Objects, Linearizability of Concurrent Objects
|Topic 6.4 Lecture, Topic 6.4 Demonstration, Topic 6.5 Lecture, Topic 6.5 Demonstration, Topic 7.4 Lecture||worksheet25||lec25-slides|
Lecture & demo quizzes for topics 6.1 - 6.6, 7.4
Lecture 26: Intro to Java Threads
|Topic 7.1 Lecture||worksheet26||lec26-slides|
Lecture 27: Java Threads (contd), Java synchronized statement
|Topic 7.2 Lecture||worksheet27||lec27-slides|
Lecture 28: Java synchronized statement (contd), advanced locking
|Topic 7.3 Lecture||worksheet28||lec28-slides|
Lecture & demo quizzes for topics 7.1, 7.2, 7.3
Lecture 29: Safety and Liveness Properties
|Topic 7.5 Lecture||worksheet29||lec29-slides|
Lecture 30: Dining Philosophers Problem
|Topic 7.6 Lecture||worksheet30||lec30-slides|
|Lecture & demo quizzes for topics 7.5, 7.6|
Lecture 31: Task Affinity with Places
Lecture 32: Apache Spark framework for cluster computing
Lecture 33: Message Passing Interface (MPI)
Homework 4 (now due by 11:59pm on April 12th)
Lecture 34: Message Passing Interface (MPI, contd)
Lecture 35: PGAS languages
Lecture 36: Memory Consistency Models
Lecture 37: GPU Computing
Lecture 38: Fortress language
Lecture 39: Course Review (lectures 20-37), Last day of classes
|lec39-slides||Homework 5 (automatic extension till May 1)|
Scheduled final exam during 0900-1200 (Herzstein Hall Amphitheatre)
Infrastructure setup, Async-Finish Parallel Programming
Abstract performance metrics with async & finish
Futures and Data Race detection
|lab3-handout||lab_3_futures.zip and lab_3_datarace.zip|
Real Performance from Finish Accumulators and Loop-Level Parallelism
|lab4-handout and lab4-slides||lab_4_forall.zip and lab_4_hjviz.zip|
Loop Chunking and Barrier Synchronization
|lab5-handout and lab5-slides||lab_5_onedimavg.zip|
Futures vs. Data-Driven Futures
|lab6-handout and lab6-slides||lab_6_ddfs_and_futures.zip|
Unix / Command line Basics, Phasers
|lab7-handout and lab7-slides||lab_7.zip|
No lab this week — Spring Break
Eureka-style Speculative Task Parallelism
Isolated Statement and Atomic Variables
|lab11-handout and lab11-slides||lab_11_threads.zip|
|lab12-handout and lab12-slides|
|14||Apr 22||Message Passing Interface (MPI)||lab14-handout|
Grading will be based on your performance on five homeworks (weighted 40% in all), two exams (weighted 20% each), weekly lab exercises (weighted 10% in all), and class participation including worksheets, in-class Q&A, Piazza participation, and online quizzes (weighted 10% in all).
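As a rough illustration of how the weights above combine, the following Java sketch computes an overall percentage from per-component percentages. The weights come from the paragraph above. The class name, method name, and the sample scores are hypothetical. It is also an assumption that each component is first reduced to a single percentage (for example, an aggregate over the five homeworks), since the page does not say how individual items within a component are combined.

```java
// Hypothetical helper, not part of any course tooling.
class CourseGrade {
    // Weights as stated in the grading paragraph.
    static final double HOMEWORK_WEIGHT = 0.40;      // five homeworks, 40% in all
    static final double EXAM_WEIGHT = 0.20;          // each of the two exams
    static final double LAB_WEIGHT = 0.10;           // weekly lab exercises, 10% in all
    static final double PARTICIPATION_WEIGHT = 0.10; // worksheets, Q&A, Piazza, quizzes

    // Each argument is a percentage in [0, 100] for that component
    // (assumed to be already aggregated within the component).
    static double overall(double homeworkPct, double exam1Pct, double exam2Pct,
                          double labPct, double participationPct) {
        return HOMEWORK_WEIGHT * homeworkPct
             + EXAM_WEIGHT * exam1Pct
             + EXAM_WEIGHT * exam2Pct
             + LAB_WEIGHT * labPct
             + PARTICIPATION_WEIGHT * participationPct;
    }

    public static void main(String[] args) {
        // Made-up sample scores, for illustration only.
        double total = overall(90, 80, 85, 100, 95);
        System.out.printf("Overall: %.1f%n", total);
    }
}
```

With the made-up scores in `main`, the contributions are 36 (homeworks), 16 and 17 (exams), 10 (labs), and 9.5 (participation). The sketch ignores late penalties and slip days.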
The purpose of the homeworks is to train you to solve problems and to help deepen your understanding of concepts introduced in class. Homeworks are due on the dates and times specified in the course schedule. Please turn in all your homeworks using the subversion system set up for the class. Homework is worth full credit when turned in on time. A 10% penalty per day will be levied on late homeworks, up to a maximum of 6 days. No submissions will be accepted more than 6 days after the due date.
As in COMP 321, all students will be given 3 slip days to use throughout the semester. When you use a slip day, you will receive up to 24 additional hours to complete the assignment. You may use these slip days in any way you see fit (3 days on one assignment, 1 day each on 3 assignments, etc.). The only requirement for use of your slip days is that you e-mail the instructors before the assignment is due. On group projects, each student in the group must use a slip day in order to extend the deadline for the assignment. When slip days are used, you should clearly indicate so at the beginning of the assignment writeup. Other than slip days, no extensions will be given unless there are exceptional circumstances (such as severe sickness; having too much other work does not qualify). Such extensions must be requested and approved by the instructor (via e-mail, phone, or in person) before the due date for the assignment. Last-minute requests are likely to be denied.
You will be expected to follow the Honor Code in all homeworks and exams. All submitted homeworks are expected to be the result of your individual effort. You are free to discuss course material and approaches to homework problems with your other classmates, the teaching assistants and the professor, but you should never misrepresent someone else’s work as your own. If you use any material from external sources, you must provide proper attribution (as shown here). Exams 1 and 2 test your individual understanding and knowledge of the material. Exams are closed-book, and collaboration on exams is strictly forbidden. Finally, it is also your responsibility to protect your homeworks and exams from unauthorized access.
Graded homeworks will be returned to you via email, and exams as marked-up hardcopies. If you believe we have made an error in grading your homework or exam, please bring the matter to our attention within one week.
Students with disabilities are encouraged to contact me during the first two weeks of class regarding any special needs. Students with disabilities should also contact Disabled Student Services in the Ley Student Center and the Rice Disability Support Services.