COL729 : Compiler Optimizations : About
Course Staff
Instructor: Sorav Bansal.
Teaching Assistants (TAs):
Lectures: Mon, Thu. 8-9.30 (Slot A), SIT 006
Pre-requisites
This course will briefly sketch the fundamentals of compiler design and will quickly move to
the primary focus of the course: compiler optimizations. We recommend that
you have taken a compiler-design course beforehand. However, if you have not taken such a
course but are very interested in taking this one, please write to me
with your CV/resume and I can guide you based on your background. You
should have at least taken a UG course on logic and/or programming languages.
The official pre-requisites are (see courses of study):
- COL 216 : Computer Architecture
- COL 226 : Programming Languages
or equivalent.
Why study compiler optimization?
Compiler optimizations are central to further progress in computer science: compilers
bridge the gap between what humans find easy to understand and what machines
find easy to execute. In today's research landscape, there is perhaps no CS area
that does not need this important capability:
- Domain-specific languages for writing operating systems and
AI/machine-learning algorithms are very popular in the research
community. The only way to harness their potential is through great compiler support.
- The performance gap between safe languages (like Java/Python/OCaml) and
performant languages (like C/C++) depends on the power of the compilation
infrastructure.
- Efficiently harnessing the compute power of modern devices like
multi-core CPUs, GPUs, FPGAs, etc., is tightly dependent on compiler
support.
Some discussion on compiler optimizations in recent Turing award lectures is available here and here.
My opinion: perhaps the most important innovations in CS over the next decade will
be based primarily on advances in compilation support. See the references page for some more pointers.
How it improves your understanding of computer science:
Compilers is perhaps the only CS area that combines rich knowledge of
logic, algorithms, and theory with low-level systems programming, design, and
performance optimization. Logicians very commonly
derive research problems from real-world compiler problems; likewise, system
builders and researchers find some of their most intriguing and difficult
problems in compiler design.
Course topics
Following are the tentative course topics; the exact topics will evolve as
the course progresses:
- Data-flow analyses : basic mathematical framework, examples like global constant propagation, liveness, reaching definitions, static single assignment, common subexpression elimination, alias analysis, lazy code motion, and perhaps some more.
- Region-based analysis : natural loops, induction variable analysis, strength reduction based on induction variable analysis, auto-parallelization based on induction-variable analysis, reduction operations, and perhaps some more.
- Inter-procedural analysis : inlining, inter-procedural data-flow analysis, function summaries, partial inlining.
- Undefined and unspecified behaviour semantics and their effects on compiler optimization
- Register allocation : graph-based techniques, enumerative techniques, performance studies
- Peephole optimizations : examples from multiple architectures, superoptimization techniques to learn peephole optimizations (GNU Superoptimizer, Peephole Superoptimizer).
- Code synthesis and superoptimization : study some state-of-the-art superoptimizers and their pros and cons, e.g. Embecosm's GNU Superoptimizer 2.0, another effort by Embecosm.
- Non-bisimilar loop transformations, such as loop tiling, loop interchange, blocking, for better locality and parallelism.
- Polyhedral optimization framework, e.g., Polly.
- Domain-specific languages and compilers, e.g., machine-learning, high-performance networking, and some more
- Empirical studies on compiler optimization
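To make the first topic above concrete, here is a minimal sketch of constant propagation and folding. Everything here is invented for illustration: the tuple-based three-address IR, the opcode names, and the restriction to straight-line code. A real pass runs over a control-flow graph using the full data-flow framework covered in the course.

```python
def propagate_constants(instrs):
    """Forward pass over a toy straight-line, three-address IR.
    Instructions are tuples (dest, op, arg1, arg2); operands are
    ints (literals) or strs (variable names)."""
    env = {}   # variable -> known constant value
    out = []
    for dest, op, a, b in instrs:
        # Substitute operands whose constant values are already known.
        if isinstance(a, str) and a in env:
            a = env[a]
        if isinstance(b, str) and b in env:
            b = env[b]
        if op == "const":
            env[dest] = a
        elif op in ("add", "mul") and isinstance(a, int) and isinstance(b, int):
            # Both operands are constants: fold the arithmetic at compile time.
            val = a + b if op == "add" else a * b
            env[dest] = val
            op, a, b = "const", val, None
        else:
            env.pop(dest, None)   # dest is no longer a known constant
        out.append((dest, op, a, b))
    return out
```

For example, in the sequence `x = 3; y = 4; z = x + y; w = z * t`, the pass folds `z` to the constant 7 and rewrites the last instruction to use 7 directly, even though `w` itself cannot be folded because `t` is unknown.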
Lab assignments
We will have assignments primarily based on the LLVM IR to supplement the
course material. The assignment load is expected to be moderate.
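The assignments will involve writing passes of this general shape over LLVM IR. As a toy illustration only (in Python, over an invented tuple IR rather than real LLVM IR), here is a one-pattern peephole pass that replaces multiplication by a power of two with a shift, a classic strength-reduction rewrite:

```python
def peephole(instrs):
    """Single-instruction peephole pass over (dest, op, arg1, arg2) tuples:
    rewrites multiplication by a positive power-of-two literal into a
    left shift, which is cheaper on most architectures."""
    out = []
    for dest, op, a, b in instrs:
        if op == "mul" and isinstance(b, int) and b > 0 and b & (b - 1) == 0:
            # x * 2^k  ->  x << k
            out.append((dest, "shl", a, b.bit_length() - 1))
        else:
            out.append((dest, op, a, b))
    return out
```

An LLVM pass doing the same job would walk instructions of a `Function` and call the IR-rewriting APIs instead of building a new list, but the pattern-match-and-replace structure is the same.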