6.338J/18.337J Lecture Videos

Lecture Videos - Spring 2005

The lectures are not being taped this year. Suggested video sequences from previous years are shown below.

2002 videos

2003 videos

Lecture 1 (2/3) - Introduction

Lecture 2/4/2003
- Course admin
- Parallel Computer Architectures
- Top 500
- Horst Simon's talk
- Performance trends
- Jack Dongarra's notes on Earth Simulator
- Beowulf clusters
- The last few minutes of the lecture are broken

Lecture 2 (2/5) - MPI, OpenMP, and STAR-P

Lecture 2/11/2003 MPI
- 03:45 - 05:10 Philisophy of MPI
- 05:23 - 08:00 Old Classification: SIMD/MIMD/SPMD
- 08:00 - 09:09 Cooperative (Send/receive) vs non-cooperative (put/get)
- 15:55 - 17:30 First MPI example - MPI_Init and MPI_Finalize
- 17:30 - 20:00 How to compile and run
- 20:00 - 22:40 MPI_Comm_rank and size
- 22:40 - 30:00 MPI_Send and Receive
- 30:00 - 35:00 Broadcast and Reduce
- 35:00 - 41:00 Scan (and other related prefix like functions)
- 41:00 - 46:00 Block vs non-blocking
Lecture 2/13/2002 OpenMP
- 12:00 - 12:40 Connecting to the class machine
- 12:40 - 23:30 OpenMP introduction
- 23:30 - 55:20 OpenMP details and example
- 55:20 - 59:30 Ron: running OpenMP on our beowulf
Lecture 2/6/2003 STAR-P
- 21:30 - 39:30 MATLAB*P Data parallel mode
- 39:30 - 55:00 MATLAB*P MM-mode
Total time: 123:15

Lecture 3 (2/10) - Parallel Prefix

Lecture 2/19/2002
- 07:00 - 15:00 Start
- 15:00 - 26:00 Parallel Prefix Algorithm
- 26:00 - 42:00 Associative Operations
- 42:00 - 47:00 The "Myth" of log n
- 47:00 - 54:00 Prefix Operation Segmented
- 54:00 - 61:00 Fortran
- 61:00 - 70:00 Parallel Prefix Variations
- 70:00 - 84:00 PRAM

Lecture 4 (2/12) - Dense Linear Algebra I

Lecture 2/25/2002

0:07:00 - 0:14:00 Start
0:14:00 - 0:27:00 Linear Algebra Libraries
0:27:00 - 0:29:00 History, Top 500 ...
0:29:00 - 0:39:00 Optimizing Computation and Memoery Use
0:39:00 - 0:43:00 BLAS
0:43:00 - 0:47:00 Self Adapting Numerical Software
0:47:00 - 0:53:00 Software Generation Strategy
0:53:00 - 1:01:00 Gaussian Elimination
1:01:00 - 1:04:00 Distributed and Parallel Systems
1:04:00 - 1:07:00 Three Basic Linear Algebra Problems
1:07:00 - 1:08:00 Results for Parallel Implementation on Intel Delta
1:08:00 - 1:26:00 More on Gaussian Elimination (Reorganization)
1:26:00 - 1:28:00 ScaLAPACK and MATLAB*P

Lecture 5 (2/19) - Dense Linear Algebra II

Lecture 2/27/2002

0:07:00 - 0:08:00 Linear Algebra: Start
0:08:00 - 0:11:00 Linear Algebra: Fundamental Triangle
0:11:00 - 0:12:00 Linear Algebra: Algorithm & Architecture
0:12:00 - 0:16:00 Linear Algebra: Architecture ...
0:16:00 - 0:20:00 Dense Linear Algebra
0:20:00 - 0:22:00 Linear Algebra: Basic Algorithm Change
0:22:00 - 0:29:00 Linear Algebra: FMA Instruction
0:29:00 - 0:30:00 Blocking
0:30:00 - 0:34:00 Linear Algebra: Recursion
0:34:00 - 0:35:00 Block Column Major Order
0:35:00 - 0:39:00 Linear Algebra: Square Block ...
0:39:00 - 0:42:00 Blocked Mat-Mult is Optimal
0:42:00 - 0:44:00 Linear Algebra: Matrix Multiplication is Pervasive
0:44:00 - 0:48:00 Linear Algebra: Recursion
0:48:00 - 0:49:00 Linear Algebra: Standard Full Format
0:49:00 - 0:51:00 Linear Algebra: Block Hybrid Full Format
0:51:00 - 0:55:00 Linear Algebra: Blocked Based Algorithms Via LAPACK
0:55:00 - 0:57:00 Linear Algebra: Concise Algorithms Emerge
0:57:00 - 1:01:00 Linear Algebra: Tree Diagram of Cholesky Algorithm
1:01:00 - 1:04:00 Linear Algebra: Challenge of Machine Independent Design of Dense Linear Algebra Codes via the BLAS
1:04:00 - 1:06:00 Linear Algebra: Can we exploit this general relationship?
1:06:00 - 1:07:00 Recursive Data Format
1:07:00 - 1:11:00 Linear Algebra: Dimension Theory
1:11:00 - 1:12:00 Linear Algebra: Changes
1:12:00 - 1:13:00 Linear Algebra: New LAPACK Type Routine
1:13:00 - 1:20:00 Linear Algebra: Answering questions from audience

Lecture 6 (2/24) - Sparse Linear Algebra I

Lecture 4/1/2002

0:00:00 - 0:04:30 Start, Overview
0:04:30 - 0:10:20 MATLAB Demo
0:10:20 - 0:20:00 Sparse Matrices in Real Life
0:20:00 - 0:21:00 MATLAB Matrices: Design Principles
0:21:00 - 0:22:30 Data Structures
0:22:30 - 0:34:00 Algorithms (Ax=b)
0:34:00 - 0:35:20 Solving Linear Equations: x=A\b
0:35:20 - 0:41:30 Graphs and Sparse Matrices: Cholesky Factorization
0:41:30 - 0:46:00 Elimination Tree
0:46:00 - 1:01:00 MATLAB Demo: Sparse Matrices and Graphs
1:01:00 - 1:06:10 Fill Reducing Matrix Permutations
1:06:10 - 1:12:15 Matching and Block Triangular Form
1:12:15 - 1:16:46 Complexity of Direct Methods
1:16:46 - 1:25:00 The Landscape of Sparse Ax=b Solvers

Lecture 7 (2/26) - Parallel Computer Architecture

Lecture 2/20/2002

0:07:00 - 0:08:00 Start(Parallel Computer Architecture I)
0:08:00 - 0:14:00 Latency/Bandwidth
0:14:00 - 0:23:00 Latency: the details
0:23:00 - 0:31:00 Node Architecutre
0:31:00 - 0:36:00 Bus as an interconnect network
0:36:00 - 0:40:00 Asic White at Lawrence Livermore
0:40:00 - 0:48:00 Cross Bar
0:48:00 - 0:55:00 Cache/Cache Coherence
0:55:00 - 1:09:00 CM-2
1:09:00 - 1:20:00 Hypercube
1:20:00 - 1:23:00 Routing
1:23:00 - 1:25:00 Paradiso Cafe Problem

Lecture 8 (3/2) - STAR-P Demo

(No videos)

Lecture 9 (3/4) - Sparse Linear Algebra II

Lecture 4/3/2002

0:00:00 - 0:10:00 Start, related webpages
0:10:00 - 0:15:00 Direct Methods
0:15:00 - 0:17:00 GEPP: Gaussian elimination w/ partial pivoting
0:17:00 - 0:21:30 Symmetric Positive Definite: A=R'R
0:21:30 - 0:26:00 Symbolic Gaussian Elimination
0:26:00 - 0:27:30 Sparse Triangular Solve
0:27:30 - 0:33:00 Left-looking Column LU Factorization
0:33:00 - 0:35:30 Symmetric Supernodes
0:35:30 - 0:38:40 Nonsymmetric Supernodes
0:38:40 - 0:40:50 Sequential SuperLU
0:40:50 - 0:44:40 Column Elimination Tree
0:44:40 - 0:47:00 Shared Memory SuperLU
0:47:00 - 0:48:40 Column Preordering for Sparsity
0:48:40 - 0:53:30 SuperLU dist: GE with static pivoting
0:53:30 - 0:58:00 Row permutation for heavy diagonal
0:58:00 - 1:05:30 Iterative refinement to improve solution
1:05:30 - 1:08:00 Question: preordering for static pivoting
1:08:00 - 1:15:30 Symmetric-pattern multifrontal factorization
1:15:30 - 1:18:40 MUMPS: distributed memory multifrontal
1:18:40 - 1:24:15 Remark on (nonsymmetric) direct methods

Lecture 10 (3/9) - Fast Fourier Transform

Lecture 4/29/2002
- 0:00:00 - 0:01:15 Start, topic: Fast Fourier Transform (FFT)
- 0:01:15 - 0:05:15 Definition of discrete Fourier transform
- 0:05:15 - 0:09:25 Pictures of FFTs
- 0:09:25 - 0:23:52 Example of phone tones
- 0:23:52 - 0:26:57 The FFT algorithm
- 0:26:57 - 0:29:30 Unshuffle
- 0:29:30 - 0:31:20 The matrix recurrence for Fn
- 0:31:20 - 0:37:00 Recursive form of the algorithm
- 0:37:00 - 0:44:45 Unwrapping the recurrence: bit reversal
- 0:44:45 - 0:47:45 Books on the FFT
- 0:47:45 - 0:53:00 Performance issues: the butterfly
- 0:53:00 - 0:57:20 Parallel issues: communication
- 0:57:20 - 1:05:30 notation: putting bars on digits, hypercube FFT
- 1:05:30 - 1:11:00 Back to parallel communication issues
- 1:11:00 - 1:21:07 Detailed look at a parallel FFT of size 32 on 4 processors
- 1:21:07 - 1:22:30 FFTW preview: fastest FFT in the West

Lecture 11 (3/11) - Domain Decomposition

Lecture 4/17/2002
- 0:00:00 - 0:01:50 Domain Decomposition (DD)
- 0:01:50 - 0:10:00 Overlapping case of DD, example in FEMLAB
- 0:10:00 - 0:19:30 Summary of overlapping DD
- 0:19:30 - 0:21:45 History: why did Schwarz do this in 1870?
- 0:21:45 - 0:30:00 Nonoverlapping DD, example in FEMLAB
- 0:30:00 - 0:39:00 Normal derivatives on the boundary
- 0:39:00 - 0:40:00 Summary of overlapping and nonoverlapping DD
- 0:40:00 - 0:42:45 Award ceremony
- 0:42:45 - 0:46:00 Discretization
- 0:46:00 - 0:52:30 The overlapping case: computational issues
- 0:52:30 - 1:00:40 Jacobi and Gauss-Seidel iterations
- 1:00:40 - 1:06:00 Overlappping DD as block Jacobi or block Gauss-Seidel
- 1:06:00 - 1:11:07 Linear algebra formulation of Jacobi and Gauss-Seidel
- 1:11:07 - 1:20:00 The nonoverlapping case: computational issues
- 1:20:00 - 1:23:45 Solving the Schur complement system iteratively

Lecture 14 (3/30) - Sparse Linear Algebra III

Lecture 4/8/2002

0:00:00 - 0:05:00 Start, Super LU-dist: iterative refinement
0:05:00 - 0:09:45 Convergence analysis of iterative refinement
0:09:45 - 0:14:45 The Landscape of Sparse Ax=b Solvers
0:14:45 - 0:22:00 Conjugate gradient iteration
0:22:00 - 0:31:15 Conjugate gradient: Krylov subspaces
0:31:15 - 0:37:15 Conjugate gradient: Convergence
0:37:15 - 0:47:15 Matlab demo
0:47:15 - 0:59:30 Conjugate gradient: Parallel implementation
0:59:30 - 1:07:00 Preconditioners
1:07:00 - 1:18:30 Incomplete Cholesky factorization
1:18:30 - 1:20:30 Sparse approximation
1:20:30 - 1:21:30 Support graph preconditioners: example
1:21:30 - 1:22:00 Multigrid
1:22:00 - 1:25:00 Complexity of direct method

Lecture 15 (4/1) - Fast Multipole I

Lecture 3/6/2002
- 0:08:00 - 0:09:00 Start
- 0:09:00 - 0:30:00 MATLAB*P Demo
- 0:30:00 - 0:34:00 N-Body Problem
- 0:34:00 - 0:40:00 N-Body Problem: What is the Computation?
- 0:40:00 - 0:42:00 N-Body Problem: O(n^2)? Right?
- 0:42:00 - 0:46:00 N-Body Problem: Variations
- 0:46:00 - 0:50:00 N-Body Problem: O(n^2) VS O(nlog(n))
- 0:50:00 - 0:53:00 N-Body Problem: nlog(n) Type of Computation
- 0:53:00 - 0:57:00 Data Structure: Quad-tree
- 0:57:00 - 1:00:00 Data Structure: Qct-tree
- 1:00:00 - 1:08:00 Barnes-Hut
- 1:08:00 - 1:11:00 Multipole (in 1D, constant potentials)
- 1:11:00 - 1:15:00 Multipole (in 1D, potential= quadratic polynomials)
- 1:15:00 - 1:17:00 Multipole (global coordinates vs local coordinates)
- 1:17:00 - 1:26:00 Multipole (Vi(x)=qi/(x-xi))
- 1:26:00 - 1:28:00 Multipole Series

Lecture 16 (4/6) - Fast Multipole II

Lecture 3/11/2002
- 0:00:00 - 0:02:00 Start: projects and homework schedule
- 0:02:00 - 0:07:00 Fast multipole: quick review; adding functions
- 0:07:00 - 0:12:30 Fast multipole: Analogy with finite precision arithmetic
- 0:12:30 - 0:21:00 Exclusion sum as matrix/vector multiplication
- 0:21:00 - 0:25:20 Exclusion sum as matrix decomposition
- 0:25:20 - 0:29:15 Multipole as matrix decomposition
- 0:29:15 - 0:32:00 Representing functions: Taylor series
- 0:32:00 - 0:45:30 Interpolating polynomials
- 0:45:30 - 0:52:00 Matlab: Symbolic example
- 0:52:00 - 1:04:30 Multipole series (as Taylor series in 1/x)
- 1:04:30 - 1:10:00 Virtual charges
- 1:10:00 - 1:17:45 Fitting a polynomial to the example
- 1:17:45 - 1:18:45 Adding representations of functions
- 1:18:45 - 1:20:10 Summary of the whole fast multipole algorithm

Lecture 17 (4/8) - Fast Multipole III

Lecture 3/13/2002
- 0:00:00 - 0:01:43 Start: setting up A/V
- 0:01:43 - 0:02:25 Tape running. Project proposal status
- 0:02:25 - 0:10:00 Multipole continued: object-oriented Matlab; exclude2d
- 0:10:00 - 0:23:50 Taylor series in Matlab by operator overloading
- 0:23:50 - 0:25:00 Exclusion sum on Taylor series
- 0:25:00 - 0:33:15 Multipole series in Matlab
- 0:33:15 - 0:41:00 Addition algorithms using binomial coefficients
- 0:41:00 - 0:45:00 Code for multipole addition; the Pascal matrix; example
- 0:45:00 - 0:50:00 Multipole objects with symbolic entries (!)
- 0:50:00 - 0:55:30 Multipole: summing up the multipole sum algorithm
- 0:55:30 - 1:23:00 Multipole: Overall summary, questions
- 1:23:00 - 1:25:10 Multipole: History and other applications

Lecture 19 (4/15) - Partitioning

Lecture 4/22/2002
- 0:00:00 - 0:03:30 Start, News on the Japanese fastest computer
- 0:03:30 - 0:05:30 Special Partitioning: One way to slice a problem in half
- 0:05:30 - 0:12:00 Laplacian of a graph
- 0:12:00 - 0:18:15 Edge-Node Incidence Matrix
- 0:18:15 - 0:27:30 Spectral Partitioning
- 0:27:30 - 0:39:15 Spectral Partitioning: Solve as an eigenvalue problem
- 0:39:15 - 0:46:30 Geometric Methods
- 0:46:30 - 0:50:00 Edge separator and Vertex Separator
- 0:50:00 - 0:52:45 Theory VS. Practice
- 0:52:45 - 0:59:00 Need Theoretical Class of Good Graphs
- 0:59:00 - 1:00:50 Geometric Separator Theorem
- 1:00:50 - 1:08:00 The Algorithm (Step 1 through 6)
- 1:08:00 - 1:18:45 Demo again and Radon Point
- 1:18:45 - 1:20:00 A few tricks
- 1:20:00 - 1:20:45 ParMETIS