Bayesian methods refine Earth science predictions by incorporating new data as it arrives, improving forecasts of weather and climate change.
Bayesian statistics is an approach to data analysis that updates probabilities as new information becomes available. Our lab uses the Bayesian approach to make effective and efficient use of big data in Earth science. We start with a prior belief and update it with new data to obtain a revised probability distribution called the posterior. This continuous updating process is especially valuable when dealing with large and complex Earth science datasets, because the posterior from one analysis can serve as the prior for the next. Through Bayesian statistics, our lab improves predictions and decision-making in Earth science.
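As a rough illustration of the prior-to-posterior update described above, here is a minimal sketch using a Beta-Binomial model. The task (estimating the yearly probability that a river exceeds a flood threshold), the function names, and all the numbers are made up for illustration and do not come from our datasets or models.

```python
# Sketch: sequential Bayesian updating with a Beta-Binomial model.
# Hypothetical task: estimate the probability that a river exceeds a
# flood threshold in any given year. All numbers are illustrative.

def update_beta(alpha, beta, exceedances, years):
    """Return the posterior Beta parameters after observing new data."""
    return alpha + exceedances, beta + (years - exceedances)

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Prior belief: Beta(1, 9), i.e. roughly a 10% chance per year.
alpha, beta = 1.0, 9.0
prior_mean = beta_mean(alpha, beta)

# Each batch is (number of exceedance years, number of years observed).
batches = [(2, 10), (1, 10), (4, 10)]

posterior_means = []
for exceedances, years in batches:
    # The posterior from the previous batch acts as the prior for this one.
    alpha, beta = update_beta(alpha, beta, exceedances, years)
    posterior_means.append(beta_mean(alpha, beta))

print("prior mean:", prior_mean)
print("posterior means after each batch:", posterior_means)
```

Each pass through the loop reuses the previous posterior as the new prior, which is the continuous updating idea in its simplest form. Real Earth science problems rarely have such a convenient closed-form update and usually need numerical methods.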
The Bayesian approach offers several advantages: it can reduce the time required to train machine learning models, identify optimal parameters in complex equations while accounting for uncertainty, and improve the performance of deep learning models. Together, these lead to more accurate and efficient solutions in a range of applications (see below).
What we aim to achieve
By employing the Bayesian approach, we aim to improve machine learning training performance and quantify the uncertainties inherent in machine learning models. This enables us to develop advanced flood and drought prediction models and gain insight into the complex physics governing land-atmosphere interactions. To do so, we use water balance equations and land surface models, which together provide a foundation for understanding Earth's interconnected systems.
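To give a feel for what estimating a parameter with its uncertainty can look like, here is a minimal sketch that computes a grid-approximated posterior for a single runoff coefficient. The relation runoff = c × precipitation is a deliberately simplified stand-in for a water balance equation, not the model our lab uses. The data values, the noise level `sigma`, the flat prior, and the name `grid_posterior` are all assumptions made for illustration.

```python
import math

def grid_posterior(precip, runoff_obs, sigma, grid):
    """Posterior over the runoff coefficient c on a grid.

    Assumes runoff = c * precip + Gaussian noise with standard
    deviation sigma, and a flat prior over the grid values.
    """
    log_post = []
    for c in grid:
        # Gaussian log-likelihood (constant terms dropped).
        ll = sum(-0.5 * ((q - c * p) / sigma) ** 2
                 for p, q in zip(precip, runoff_obs))
        log_post.append(ll)
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_post)
    weights = [math.exp(lp - m) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical seasonal totals in millimetres (made-up values).
precip = [100.0, 80.0, 120.0, 60.0]
runoff_obs = [31.0, 23.0, 37.0, 17.0]
sigma = 5.0  # assumed observation noise (mm)

grid = [i / 100 for i in range(101)]  # candidate c values from 0 to 1
post = grid_posterior(precip, runoff_obs, sigma, grid)

# Summaries of the posterior: mean, standard deviation, and most probable value.
c_mean = sum(c * w for c, w in zip(grid, post))
c_sd = math.sqrt(sum(w * (c - c_mean) ** 2 for c, w in zip(grid, post)))
c_map = grid[post.index(max(post))]

print(f"posterior mean of c: {c_mean:.3f}, sd: {c_sd:.3f}, MAP: {c_map:.2f}")
```

The result is a full distribution over the parameter rather than a single best-fit value, so the spread of the posterior tells us directly how uncertain the estimate is. A grid works only for one or two parameters; models with many parameters typically rely on sampling or variational methods instead.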
Data and analytic skills we use for this project
To achieve these goals, the following five analytic skills are essential:
- Statistical expertise: A strong foundation in probability and statistics, including Bayesian methods, is crucial for understanding uncertainties and making informed decisions based on data.
- Machine learning proficiency: Familiarity with various machine learning techniques, such as supervised and unsupervised learning, as well as knowledge of deep learning algorithms, is necessary for developing accurate prediction models.
- Data processing and management: The ability to preprocess, clean, and manage large datasets is essential for efficient and effective analysis, particularly when working with big data in Earth science.
- Domain knowledge: A deep understanding of Earth science concepts, including hydrology, meteorology, and land-atmosphere interactions, is vital for interpreting results and making meaningful connections between the data and real-world phenomena.
- Programming and software skills: Proficiency in programming languages, such as Python or R, and familiarity with relevant software tools, like TensorFlow or PyTorch for deep learning, are necessary for implementing and automating the various stages of the analytical process.
Our supportive academic environment is here to help you learn and develop the necessary skills along the way.
If you are interested in any of the following research areas, please do not hesitate to contact me!
- Streamflow prediction with satellite data: Develop a Bayesian machine learning model that integrates remote sensing data, such as precipitation estimates and soil moisture, with in-situ measurements to improve streamflow predictions, supporting water resource management and flood forecasting.
- Evapotranspiration estimation: Apply Bayesian machine learning techniques to combine remote sensing data, including land surface temperature and vegetation indices, with meteorological data to estimate evapotranspiration rates more accurately, aiding in agricultural water management and climate studies.
- Groundwater storage estimation: Utilize Bayesian machine learning approaches to analyze remote sensing data, such as GRACE satellite measurements, alongside in-situ data and hydrogeological models to better estimate groundwater storage changes, guiding sustainable groundwater management practices.
- Assessing water quality using satellite imagery: Develop a Bayesian machine learning framework to analyze multi-spectral satellite imagery for estimating water quality parameters, such as turbidity, chlorophyll-a concentration, and dissolved organic matter, enabling large-scale water quality monitoring and informing water treatment strategies.