Course Outline
This course is designed to simplify the complex world of data analysis. It begins with the basics of statistical inference: making well-founded estimates and decisions from data. It then shifts to big data analysis, exploring how to handle, analyze, and draw conclusions from large datasets using modern techniques and tools. The course is ideal for anyone who wants to make informed decisions using data, and it blends foundational statistical concepts with practical big data applications in a clear, straightforward manner.
Prerequisites
- Knowledge of linear regression analysis, statistical inference, and linear algebra.
- Basic working knowledge of a scientific programming language (e.g., Python, MATLAB, or R).
- All course examples will be in Python.
Textbooks and References
- *Introduction to Linear Regression Analysis*, Fifth Edition by Montgomery, Peck, and Vining. ISBN: 978-0-470-54281-1
- *An Introduction to Statistical Learning* by James, Witten, Hastie, and Tibshirani
- *The Elements of Statistical Learning: Data Mining, Inference, and Prediction* by Hastie, Tibshirani, and Friedman
- *Mining of Massive Datasets* by Leskovec, Rajaraman, and Ullman
This course also requires the use of the following statistical and typesetting software:
- [Anaconda](https://www.anaconda.com)
- [JupyterLab](https://jupyter.org)
Additional course material and reading assignments will be provided via instructor notes and recent journal articles.
Grading
- Attendance: 10%
- Quizzes: 20%
- Assignments: 30%
- Midterm Exam: 10%
- Final Exam: 10%
- Projects: 20%
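
As a rough illustration of how the weights above might combine, here is a minimal Python sketch. It assumes each component is scored on a 0-100 scale and that the course total is a plain weighted sum. The syllabus does not state either of these, so treat both as assumptions. The function name `weighted_total` and the example scores are hypothetical.

```python
# Weights copied from the grading list above; together they sum to 100%.
WEIGHTS = {
    "attendance": 0.10,
    "quizzes": 0.20,
    "assignments": 0.30,
    "midterm": 0.10,
    "final": 0.10,
    "projects": 0.20,
}


def weighted_total(scores):
    """Return the weighted course total for component scores on a 0-100 scale.

    Assumption: the total is a simple weighted sum of the components.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example_scores = {
    "attendance": 100,
    "quizzes": 80,
    "assignments": 90,
    "midterm": 70,
    "final": 85,
    "projects": 95,
}

print(f"Weighted total: {weighted_total(example_scores):.1f}")
```

Because assignments carry the largest weight (30%), a change in the assignment score moves the total three times as much as the same change in the midterm or final exam score.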
Additional Information
- Course materials will be available on the class webpage.
- Pre-class assignments (reading and coding) are required to prepare for the lectures. Failure to complete these may result in a lower participation grade.
Contact
For more information or inquiries about the course, please contact Prof. Hyunglok Kim at hyunglokkim@gist.ac.kr.