Instructors: Prof. Vivek Sarkar (DH 3131), Dr. Mackale Joyner
Head TA: Max Grossman
Graduate TAs: Jonathan Sharman, Ryan Spring, Bing Xue, Lechen Yu
Admin Assistant: Annepha Hurlock, firstname.lastname@example.org, DH 3080, 713-348-5186
Undergraduate TAs: Marc Canby, Anna Chi, Peter Elmers, Joseph Hungate, Cary Jiang, Gloria Kim, Kevin Mullin, Victoria Nazari, Ashok Sankaran, Sujay Tadwalkar, Anant Tibrewal, Eugene Wang, Yufeng Zhou
https://piazza.com/class/ixdqx0x3bjl6en (Piazza is the preferred medium for all course communications, but you can also send email to comp322-staff at rice dot edu if needed)
Lectures: Herzstein Hall 210, MWF 1:00pm - 1:50pm
Labs: DH 1064 and DH 1070, Wednesday 7:00pm - 8:30pm
A summary PDF file containing the course syllabus can be found here. Much of the syllabus information is also included below on this course web site, along with some additional details that are not in the PDF.
The primary goal of COMP 322 is to introduce you to the fundamentals of parallel programming and parallel algorithms, by following a pedagogic approach that exposes you to the intellectual challenges in parallel software without enmeshing you in the jargon and lower-level details of today's parallel systems. A strong grasp of the course fundamentals will enable you to quickly pick up any specific parallel programming system that you may encounter in the future, and also prepare you for studying advanced topics related to parallelism and concurrency in courses such as COMP 422.
The desired learning outcomes fall into three major areas (course modules):
1) Parallelism: creation and coordination of parallelism (async, finish), abstract performance metrics (work, critical paths), Amdahl's Law, weak vs. strong scaling, data races and determinism, data race avoidance (immutability, futures, accumulators, dataflow), deadlock avoidance, abstract vs. real performance (granularity, scalability), collective & point-to-point synchronization (phasers, barriers), parallel algorithms, systolic algorithms. (A short async/finish code sketch follows this list.)
2) Concurrency: critical sections, atomicity, isolation, high level data races, nondeterminism, linearizability, liveness/progress guarantees, actors, request-response parallelism, Java Concurrency, locks, condition variables, semaphores, memory consistency models.
3) Locality & Distribution: memory hierarchies, locality, cache affinity, data movement, message-passing (MPI), communication overheads (bandwidth, latency), MapReduce, accelerators, GPGPUs, CUDA, OpenCL.
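To make the first of these areas concrete, here is a minimal sketch of the async/finish pattern from Module 1. It is written against the HJ-lib API used in this course (launchHabaneroApp, finish, and async from edu.rice.hj.Module1); the class name, array contents, and two-way split are illustrative choices, not course-provided code.

    // Minimal async/finish sketch, assuming HJ-lib's Module1 API.
    import static edu.rice.hj.Module1.*;

    public class ArraySumSketch {
        public static void main(String[] args) {
            int[] a = {1, 2, 3, 4, 5, 6, 7, 8};
            int[] partial = new int[2]; // one slot per task, so the two tasks never write the same location

            launchHabaneroApp(() -> {
                finish(() -> { // finish waits for all tasks spawned (transitively) in its body
                    async(() -> { // child task: sum the lower half
                        for (int i = 0; i < a.length / 2; i++) {
                            partial[0] += a[i];
                        }
                    });
                    // parent task: sum the upper half in parallel with the child
                    for (int i = a.length / 2; i < a.length; i++) {
                        partial[1] += a[i];
                    }
                });
                // safe to combine here: the enclosing finish guarantees both halves are done
                System.out.println("sum = " + (partial[0] + partial[1]));
            });
        }
    }

Writing each partial sum to its own array slot is what makes this sketch data-race free, a theme Module 1 returns to repeatedly.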
To achieve these learning outcomes, each class period will include time for both instructor lectures and in-class exercises based on assigned reading and videos. The lab exercises will be used to help students gain hands-on programming experience with the concepts introduced in the lectures.
To ensure that students gain a strong knowledge of parallel programming foundations, the classes and homeworks will place equal emphasis on both theory and practice. The programming component of the course will mostly use the Habanero-Java Library (HJ-lib) pedagogic extension to the Java language developed in the Habanero Extreme Scale Software Research project at Rice University. The course will also introduce you to real-world parallel programming models including Java Concurrency, MapReduce, MPI, OpenCL and CUDA. An important goal is that, at the end of COMP 322, you should feel comfortable programming in any parallel language for which you are familiar with the underlying sequential language (Java or C). Any parallel programming primitives that you encounter in the future should be easily recognizable based on the fundamentals studied in COMP 322.
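As one illustration of that transfer, the sketch below re-expresses the async/finish example using only the standard java.util.concurrent API, with no HJ-lib: submitting to an ExecutorService plays the role of async, and waiting on the returned Future plays the role of finish. The class name and thread-pool size are illustrative.

    // Fork-join pattern in plain Java Concurrency (java.util.concurrent only).
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class ArraySumExecutorSketch {
        public static void main(String[] args) throws Exception {
            int[] a = {1, 2, 3, 4, 5, 6, 7, 8};
            ExecutorService pool = Executors.newFixedThreadPool(2);

            // async analogue: submit a child task and get back a Future for its result
            Future<Integer> lowerHalf = pool.submit(() -> {
                int sum = 0;
                for (int i = 0; i < a.length / 2; i++) sum += a[i];
                return sum;
            });

            // parent computes the upper half while the child task runs
            int upperHalf = 0;
            for (int i = a.length / 2; i < a.length; i++) upperHalf += a[i];

            // finish analogue: get() blocks until the child task has completed
            System.out.println("sum = " + (lowerHalf.get() + upperHalf));
            pool.shutdown();
        }
    }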
The prerequisite course requirements are COMP 182 and COMP 215. COMP 322 should be accessible to anyone familiar with the foundations of sequential algorithms and data structures, and with basic Java programming. COMP 321 is also recommended as a co-requisite.
There are no required textbooks for the class. Instead, lecture handouts are provided for each module. You are expected to read the relevant sections in each lecture handout before coming to the lecture. We will also provide a number of references in the slides and handouts. The links to the latest versions of the lecture handouts are included below:
There are also a few optional textbooks that we will draw from during the course. You are encouraged to get copies of any or all of these books. They will serve as useful references both during and after this course:
Finally, here are some additional resources that may be helpful for you:
Assigned videos for each lecture are listed below; see the Canvas site for the video links.
Lecture 1: Task Creation and Termination (Async, Finish)
Reading: Module 1, Section 1.1 | Worksheet: worksheet1 | Slides: lec1-slides
Lecture 2: Computation Graphs, Ideal Parallelism
Reading: Module 1, Sections 1.2, 1.3 | Videos: Topic 1.2 Lecture, Topic 1.2 Demonstration, Topic 1.3 Lecture, Topic 1.3 Demonstration | Worksheet: worksheet2 | Slides: lec2-slides
Lecture 3: Abstract Performance Metrics, Multiprocessor Scheduling (Fri, Jan 13)
Reading: Module 1, Section 1.4 | Videos: Topic 1.4 Lecture, Topic 1.4 Demonstration | Worksheet: worksheet3 | Slides: lec3-slides
No lecture, School Holiday (Martin Luther King, Jr. Day)
Lecture 4: Parallel Speedup and Amdahl's Law
Reading: Module 1, Section 1.5 | Videos: Topic 1.5 Lecture, Topic 1.5 Demonstration | Worksheet: worksheet4 | Slides: lec4-slides
Lecture 5: Future Tasks, Functional Parallelism ("Back to the Future")
Reading: Module 1, Section 2.1 | Videos: Topic 2.1 Lecture, Topic 2.1 Demonstration | Worksheet: worksheet5 | Slides: lec5-slides
Lecture 6: Memoization
Reading: Module 1, Section 2.2 | Videos: Topic 2.2 Lecture, Topic 2.2 Demonstration | Worksheet: worksheet6 | Slides: lec6-slides
Lecture 7: Finish Accumulators
Reading: Module 1, Section 2.3 | Videos: Topic 2.3 Lecture, Topic 2.3 Demonstration | Worksheet: worksheet7 | Slides: lec7-slides | Homework 1
Lecture 8: Map Reduce
Reading: Module 1, Section 2.4 | Videos: Topic 2.4 Lecture, Topic 2.4 Demonstration | Worksheet: worksheet8 | Slides: lec8-slides
Quiz for Unit 1
Lecture 9: Data Races, Functional & Structural Determinism
Reading: Module 1, Sections 2.5, 2.6 | Videos: Topic 2.5 Lecture, Topic 2.5 Demonstration, Topic 2.6 Lecture, Topic 2.6 Demonstration | Worksheet: worksheet9 | Slides: lec9-slides
Lecture 10: Java's Fork/Join Library
Reading: Module 1, Sections 2.7, 2.8 | Videos: Topic 2.7 Lecture, Topic 2.8 Lecture | Worksheet: worksheet10 | Slides: lec10-slides
Lecture 11: Loop-Level Parallelism, Parallel Matrix Multiplication, Iteration Grouping (Chunking)
Reading: Module 1, Sections 3.1, 3.2, 3.3 | Videos: Topic 3.1 Lecture, Topic 3.1 Demonstration, Topic 3.2 Lecture, Topic 3.2 Demonstration, Topic 3.3 Lecture, Topic 3.3 Demonstration
Lecture 12: Barrier Synchronization
Reading: Module 1, Section 3.4 | Videos: Topic 3.4 Lecture, Topic 3.4 Demonstration
Lecture 13: Parallelism in Java Streams, Parallel Prefix Sums
Quiz for Unit 2
Lecture 14: Iterative Averaging Revisited, SPMD pattern
Reading: Module 1, Sections 3.5, 3.6 | Videos: Topic 3.5 Lecture, Topic 3.5 Demonstration, Topic 3.6 Lecture, Topic 3.6 Demonstration
Lecture 15: Data-Driven Tasks, Point-to-Point Synchronization with Phasers
Reading: Module 1, Sections 4.5, 4.2, 4.3 | Videos: Topic 4.5 Lecture, Topic 4.5 Demonstration, Topic 4.2 Lecture, Topic 4.2 Demonstration, Topic 4.3 Lecture, Topic 4.3 Demonstration
Lecture 16: Phasers Review
Reading: Module 1, Section 4.2 | Videos: Topic 4.2 Lecture, Topic 4.2 Demonstration | Quiz for Unit 3
Lecture 17: Midterm Summary
Midterm Review (interactive Q&A, no lecture)
Exam 1 held during lab time (7:00pm - 10:00pm); scope of exam limited to Lectures 1-16
Lecture 18: Abstract vs. Real Performance
Homework 3, Checkpoint-1
Lecture 19: Pipeline Parallelism, Signal Statement, Fuzzy Barriers
Reading: Module 1, Sections 4.4, 4.1 | Videos: Topic 4.4 Lecture, Topic 4.4 Demonstration, Topic 4.1 Lecture, Topic 4.1 Demonstration
Lecture 20: Critical sections, Isolated construct, Parallel Spanning Tree algorithm, Atomic variables (start of Module 2)
Reading: Module 2, Sections 5.1, 5.2, 5.3, 5.4, 5.6 | Videos: Topic 5.1 Lecture, Topic 5.1 Demonstration, Topic 5.2 Lecture, Topic 5.2 Demonstration, Topic 5.3 Lecture, Topic 5.3 Demonstration, Topic 5.4 Lecture, Topic 5.4 Demonstration, Topic 5.6 Lecture, Topic 5.6 Demonstration
Lecture 21: Read-Write Isolation, Review of Phasers
Reading: Module 2, Section 5.5 | Videos: Topic 5.5 Lecture, Topic 5.5 Demonstration
Quiz for Unit 4
Lecture 22: Actors
Reading: Module 2, Sections 6.1, 6.2 | Videos: Topic 6.1 Lecture, Topic 6.1 Demonstration, Topic 6.2 Lecture, Topic 6.2 Demonstration
Lecture 23: Actors (contd)
Reading: Module 2, Sections 6.3, 6.4, 6.5, 6.6 | Videos: Topic 6.3 Lecture, Topic 6.3 Demonstration, Topic 6.4 Lecture, Topic 6.4 Demonstration, Topic 6.5 Lecture, Topic 6.5 Demonstration, Topic 6.6 Lecture, Topic 6.6 Demonstration
Homework 3, Checkpoint-2
Lecture 24: Java Threads, Java synchronized statement
Reading: Module 2, Sections 7.1, 7.2 | Videos: Topic 7.1 Lecture, Topic 7.2 Lecture | Quiz for Unit 5
Mar 13 - Mar 17: Spring Break (no classes)
Lecture 25: Java synchronized statement (contd), wait/notify
Reading: Module 2, Section 7.2 | Videos: Topic 7.2 Lecture
Lecture 26: Java Locks, Linearizability of Concurrent Objects
Reading: Module 2, Sections 7.3, 7.4 | Videos: Topic 7.3 Lecture, Topic 7.4 Lecture
Homework 4 assigned (includes one intermediate checkpoint); Homework 3 (all) due
Lecture 27: Safety and Liveness Properties, Java Synchronizers, Dining Philosophers Problem
Reading: Module 2, Sections 7.5, 7.6 | Videos: Topic 7.5 Lecture, Topic 7.6 Lecture | Worksheet: worksheet27
Quiz for Unit 6
Lecture 28: Message Passing Interface (MPI) (start of Module 3)
Videos: Topic 8.1 Lecture, Topic 8.2 Lecture, Topic 8.3 Lecture | Worksheet: worksheet28
Lecture 29: Message Passing Interface (MPI, contd)
Videos: Topic 8.4 Lecture, Topic 8.5 Lecture, Topic 8 Demonstration Video
Lecture 30: Apache Hadoop and Spark frameworks for Map-Reduce
Videos: Topic 9.1 Lecture (optional, overlaps with video 2.4), Topic 9.2 Lecture, Topic 9.3 Lecture | Quiz for Unit 7
Lecture 31: TF-IDF and PageRank Algorithms with Map-Reduce
Videos: Topic 9.4 Lecture, Topic 9.5 Lecture, Unit 9 Demonstration
Lecture 32: Combining Distribution and Multithreading
Videos: Lectures 10.1 - 10.5, Unit 10 Demonstration (optional; Unit 10 has no quiz)
Homework 4, Checkpoint-1
Lecture 33: Eureka-style Speculative Task Parallelism
Quiz for Unit 8
Lecture 34: Task Affinity with Places
Lecture 35: Partitioned Global Address Space (PGAS) programming models
Homework 4 (all): due April 21st, with automatic extension until May 1st, after which slip days may be used
Lecture 36: Algorithms based on Parallel Prefix (Scan) operations
Quiz for Unit 9
Lecture 37: GPU Computing
Lecture 38: Topic TBD
Lecture 39: Course Review (lectures 19 - 38), Last day of classes
Homework 5 (automatic extension until May 1st, after which slip days may be used)
Mon, Apr 24: Review session / office hours, 1pm - 3pm, location TBD
Wed, Apr 26: Review session / office hours, 1pm - 3pm, location TBD
Fri, Apr 28: Review session / office hours, 1pm - 3pm, location TBD
9am - 12noon: scheduled final exam (Exam 2; scope of exam limited to Lectures 19 - 38), location TBD by registrar
Async-Finish Parallel Programming with abstract metrics
Futures and HJ-Viz
Cutoff Strategy and Real World Performance
Java's ForkJoin Framework
No lab this week — Exam 1
Isolated Statement and Atomic Variables
No lab this week — Spring Break
Java Threads, Java Locks
No lab this week — Willy Week!
Message Passing Interface (MPI)
Eureka-style Speculative Task Parallelism
Grading will be based on your performance on five homeworks (weighted 40% in all), two exams (weighted 40% in all), weekly lab exercises (weighted 10% in all), online quizzes (weighted 5% in all), and class participation including in-class Q&A, worksheets, Piazza participation (weighted 5% in all).
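As a purely hypothetical example of how these weights combine: scores of 90% on homeworks, 80% on exams, 100% on labs, 90% on quizzes, and 100% on class participation would yield a course score of 0.40 × 90 + 0.40 × 80 + 0.10 × 100 + 0.05 × 90 + 0.05 × 100 = 87.5%.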
The purpose of the homeworks is to give you practice in solving problems that deepen your understanding of concepts introduced in class. Homeworks are due on the dates and times specified in the course schedule. No late submissions (other than those using slip days mentioned below) will be accepted.
The slip day policy for COMP 322 is similar to that of COMP 321. All students will be given 3 slip days to use throughout the semester. When you use a slip day, you will receive up to 24 additional hours to complete the assignment. You may use these slip days in any way you see fit (3 days on one assignment, 1 day each on 3 assignments, etc.). Slip days will be automatically tracked through the Autograder; more details are available later in this document and in the Autograder user guide. Other than slip days, no extensions will be given unless there are exceptional circumstances (such as severe sickness, not because you have too much other work). Such extensions must be requested and approved by the instructor (via e-mail, phone, or in person) before the due date for the assignment. Last-minute requests are likely to be denied.
Labs must be checked off by a TA prior to the start of the lab the following week.
Worksheets should be completed in class for full credit. For partial credit, a worksheet can be turned in before the start of the class following the one in which the worksheet was distributed, so that solutions to the worksheets can be discussed in the next class.
You will be expected to follow the Honor Code in all homeworks and exams. The following policies will apply to different work products in the course:
Graded homeworks will be returned to you via email, and exams as marked-up hardcopies. If you believe we have made an error in grading your homework or exam, please bring the matter to our attention within one week.
Students with disabilities are encouraged to contact me during the first two weeks of class regarding any special needs. Students with disabilities should also contact Disabled Student Services in the Ley Student Center and Rice Disability Support Services.