Statistics for Environment and Energy Sciences Course at GIST (Spring 2025) EV3112

Course Outline

This course introduces statistical methods and their applications in environmental and energy studies. Students begin with the basics of statistics: measures of location, variability, and distribution symmetry, along with the handling of outliers and data transformations. The course then turns to the distinctive nature of environmental data, including remote sensing and model data, with emphasis on time-series, spatial, and spatio-temporal data. Basic programming skills in Python, JupyterLab, Deepnote, and Bash are developed to support data analysis.

Graphical data analysis techniques such as histograms, boxplots, and scatterplots are used to visualize one or multiple datasets. Students learn to assess the reliability of their results by understanding and interpreting interval estimates and confidence intervals. Hypothesis testing, a core component of statistical analysis, is covered extensively, including tests for normality and the rank-sum test. The course also covers testing differences between two independent groups using tests such as the t-test and the permutation test. A midterm exam assesses students’ understanding up to this point.

After the midterm, the course explores paired difference tests and comparisons of the centers of several independent groups using methods such as the Kruskal-Wallis test and analysis of variance (ANOVA). Correlation and trend analysis are discussed to help students identify relationships and trends within their data. Finally, the course introduces deep learning concepts, including sequence modeling and computer vision, with applications to environmental data. Overall, the course equips students with the statistical tools and techniques needed to analyze and interpret complex environmental and energy data.
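As a flavor of the two-group comparisons named in the outline, the following is a minimal sketch of a permutation test for a difference in means. It is not taken from the course materials. It uses only the Python standard library, and the function name `permutation_test`, the variable names, and the measurement values are hypothetical and chosen purely for illustration.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means (illustrative sketch)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random by shuffling the pooled data
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical measurements (made-up values, not real data)
upstream = [2.1, 2.4, 1.9, 2.8, 2.3, 2.0, 2.6]
downstream = [3.0, 3.4, 2.9, 3.8, 3.1, 3.6, 2.7]

observed_diff, p_value = permutation_test(upstream, downstream)
print(f"observed difference in means: {observed_diff:.3f}")
print(f"permutation p-value: {p_value:.4f}")
```

The test makes no normality assumption: it asks how often a random relabeling of the pooled observations produces a difference at least as large as the one observed.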

Prerequisites

• Knowledge of linear regression analysis, statistical inference, and linear algebra.

• Basic working knowledge of a scientific programming language (e.g., Python or R).

• All course examples will be in Python.

Textbooks and References

Introduction to Linear Regression Analysis, Fifth Edition by Montgomery, Peck, and Vining. ISBN: 978-0-470-54281-1

An Introduction to Statistical Learning by James, Witten, Hastie, and Tibshirani

The Elements of Statistical Learning: Data Mining, Inference, and Prediction by Hastie, Tibshirani, and Friedman

Mining of Massive Datasets by Leskovec, Rajaraman, and Ullman

• This course requires the use of the following statistical and typesetting software:

Anaconda

JupyterLab

• Additional course material and reading assignments will be provided via instructor notes and recent journal articles.

Grading

• Attendance: 10%

• Quizzes: 10%

• Assignments: 50%

• Midterm Exam: 15%

• Final Exam: 15%

Additional Information

• Course materials will be available on the class webpage.

• Pre-class assignments (reading and coding) are required to prepare for the lectures. Failure to complete these may result in a lower participation grade.

Contact

For more information or inquiries about the course, please contact Prof. Hyunglok Kim at hyunglokkim@gist.ac.kr.