Large Scale Computing

The course everyone should take!

Course Overview


Large-scale computing is now widely used in many domains, including HPC (High Performance Computing), Deep Learning, and Internet platform companies (such as Amazon, Google, Alibaba, MeiTuan, JD, ...). This course covers these topics in an engaging way and presents the big picture as follows:

  • Introduction with 3 applications (Weather Forecasting, CNN/DL, and MiaoSha, i.e. "Second Kill" flash sales)
  • Part I Fundamentals
    • How to solve Weather Forecasting based on a simplified Heat Equation
    • What does a program for large scale computing look like?
    • Evolution of computing systems (SMP, Vector Processor, GPU, MPP, Cluster, etc.)
    • System software to support the execution of the programs for large scale computing
  • Part II Programming for a high-performance single computer
    • OpenMP, MPI, CUDA (CNN/DL is shown here), MapReduce (MR), Spark, etc.
  • Part III Programming for a high-performance system with many computers
    • Practical skills from real large-scale e-commerce (handling "秒杀" flash sales, known as "Second Kill", and precision advertising)
  • Part IV Appendices
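
As a taste of the Part I weather forecasting topic, a simplified 1D heat equation can be stepped forward in time with an explicit finite-difference update. The following is only a minimal sketch under stated assumptions, not the course's own code: the function names `heat_step` and `simulate`, the fixed boundary values, and the choice `r = 0.25` are all illustrative.

```python
def heat_step(u, r):
    """Advance a 1D temperature profile by one explicit time step.

    u: list of temperatures at evenly spaced grid points (end points held fixed).
    r: alpha * dt / dx**2 (assumed constant); this scheme is stable only for r <= 0.5.
    """
    new_u = u[:]  # copy, so the boundary values carry over unchanged
    for i in range(1, len(u) - 1):
        # Each interior point moves toward the average of its two neighbours.
        new_u[i] = u[i] + r * (u[i - 1] - 2 * u[i] + u[i + 1])
    return new_u


def simulate(u0, r, steps):
    """Apply heat_step repeatedly, starting from the initial profile u0."""
    u = list(u0)
    for _ in range(steps):
        u = heat_step(u, r)
    return u


if __name__ == "__main__":
    rod = [0.0] * 11
    rod[5] = 100.0  # a single hot spot in the middle of a cold rod
    print(simulate(rod, r=0.25, steps=50))
```

Within one time step, every interior point is computed only from the previous step's values, so the updates are independent of each other. That independence is what lets this kind of loop be split across cores, GPUs, or machines.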

Schedule

Lectures Description Course Materials
1 Introduction

Why and how do I demonstrate Large Scale Computing?

HPC, DL, and business systems are now merging into one field - Large Scale Computing

[Introduction]
2 Weather Forecasting
  • Brief history of weather forecasting
  • Numerical computing for weather forecasting
  • Sequential code in Python
  • What does an LSC program look like?
[A Weather Forecasting][in Python]
3 LSC systems
  • Three trends toward high-performance computer systems
[Evolution of LSC systems]
4 DOS?
  • Distributed OS? - kind of ...
  • A modern OS is sufficient for a single-computer system, whether multi-core or with GPGPU
  • Supporting the concurrent execution of many programs on an MPP or cluster is quite different, which is where the protocol stack comes in
[DOS?-Protocol stack][DOS?-other examples]
5 Overview of programming frameworks: OpenMP, MPI, CUDA, and MR/Spark/... from Big Data [Overview]
5 Brief OpenMP [OpenMP]
5 Brief MPI [MPI]
5 Brief CUDA [CUDA]
5 Brief Big Data programming [MR][Spark][...]
6 "秒杀" (Second Kill) system: integration programming to solve "Second Kill" [Second Kill]
6 Brief Computational Advertising [Brief Computing Adv]
7 Review [Review]

Course Logistics and Policies


Grading Projects: 60% (4 in total), Research Paper Reading: 20%, New Technology Investigation: 20%.

Textbook There is no required textbook, but for students who want additional resources, we recommend the following:
  • John D. Cox. Storm Watchers: The Turbulent History of Weather Prediction (Chinese edition: 风暴守望者 天气预报风云史, translated by 闻新宇, 贾喆, 朱清照) [M]. Wiley. 2002.
  • Thomas Sterling, Matthew Anderson, Maciej Brodowicz. High Performance Computing: Modern Systems and Practices [M]. 2017.
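  • Chen Guoliang (陈国良). Parallel Computing: Architectures, Algorithms, Programming (并行计算:结构、算法、编程) [M]. Higher Education Press (高等教育出版社). 2003.
  • Andrew S. Tanenbaum, Maarten van Steen. Distributed Systems, 3rd edition. 2017.
  • M. Tamer Özsu, Patrick Valduriez. Principles of Distributed Database Systems (Chinese edition: 分布式数据库系统原理, translated by 周立柱, 范举). Tsinghua University Press (清华大学出版社). 2014.
  • Yu Ge (于戈), Shen Derong (申德荣), et al. Distributed Database Systems: New Database Technologies in the Big Data Era, 2nd edition (分布式数据库系统:大数据时代新型数据库技术). China Machine Press (机械工业出版社). 2021.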
  • Ian Goodfellow, Yoshua Bengio, Aaron Courville. Deep Learning.
  • Liu Peng (刘鹏), Wang Chao (王超). Computational Advertising (计算广告). Posts & Telecom Press (人民邮电出版社). 2015.