Emulation Wrap-Up and Class Review


Lecture 22

May 1, 2024

Review of Last Class

Benefits of Model Simplicity

  • More thorough representation of uncertainties
  • Can focus on “important” characteristics for problem at hand
  • Potential increase in generalizability

Computational Complexity

Source: Helgeson et al. (2021)

Downsides of Model Simplicity

  • Potential loss of salience
  • May miss important dynamics (creating bias)
  • Parameter/dynamical compensation can result in loss of interpretability

Simplicity Tradeoffs

Simple models can be epistemically and practically valuable.

But:

Need to carefully select which processes/parameters are included in the simplified representation, and at what resolution.

Approximating Complex Models

Challenge: How do we simplify complex models to keep key dynamics but reduce computational expense?

Approximate (or emulate) the model response surface.

  1. Evaluate the original model at an ensemble of points (design of experiments);
  2. Calibrate the emulator against those points;
  3. Use the emulator for UQ with MCMC or other methods.
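The three steps above can be sketched with a minimal Gaussian process emulator in pure NumPy. The "expensive" model, the kernel, and its length scale are all assumptions for illustration; in practice you would tune the hyperparameters and use a more numerically careful solver:

```python
import numpy as np

# Hypothetical "expensive" model we want to emulate (assumed for illustration)
def expensive_model(x):
    return np.sin(3 * x) + 0.5 * x

def rbf_kernel(X1, X2, length=0.3, var=1.0):
    """Squared-exponential covariance between two sets of 1-D points."""
    d = X1[:, None] - X2[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

# 1. Evaluate the original model at a small ensemble of design points
X_train = np.linspace(0, 2, 8)
y_train = expensive_model(X_train)

# 2. "Calibrate" the emulator: condition the GP on the training runs
K = rbf_kernel(X_train, X_train) + 1e-8 * np.eye(len(X_train))  # jitter for stability
K_inv = np.linalg.inv(K)

def gp_predict(X_new):
    """Posterior mean and variance of the GP emulator at new points."""
    K_s = rbf_kernel(X_new, X_train)
    mean = K_s @ K_inv @ y_train
    var = np.diag(rbf_kernel(X_new, X_new) - K_s @ K_inv @ K_s.T)
    return mean, var

# 3. Use the cheap emulator in place of the model (e.g., inside an MCMC loop)
mean, var = gp_predict(np.array([0.5, 1.5]))
```

The emulator interpolates the training runs (near-zero predictive variance there) and returns growing uncertainty away from them, which is what makes GPs attractive for UQ.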

Design of Experiments

Important to strike a balance between:

  • Computational expense for model evaluation
  • Dense/expansive enough sample for training
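One common way to strike this balance is Latin hypercube sampling, which spreads a small budget of model runs across every parameter's range. A minimal NumPy sketch (the sample size and dimension are arbitrary choices for illustration):

```python
import numpy as np

def latin_hypercube(n_samples, n_dims, rng=None):
    """Simple Latin hypercube sample on the unit cube: exactly one point
    per stratum in every dimension, with strata shuffled independently."""
    rng = np.random.default_rng(rng)
    # Stratify [0, 1) into n_samples equal bins and jitter within each bin
    u = (np.arange(n_samples)[:, None] + rng.random((n_samples, n_dims))) / n_samples
    # Shuffle the bin assignments independently per dimension
    for j in range(n_dims):
        rng.shuffle(u[:, j])
    return u

# 20 design points in 3 dimensions: far fewer model runs than a 20^3
# grid, but each parameter's range is still covered evenly
design = latin_hypercube(20, 3, rng=42)
```

Each column hits every one of the 20 strata exactly once, so marginal coverage is guaranteed even with few samples. (`scipy.stats.qmc.LatinHypercube` provides a production-quality version.)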

Emulation Methods

Overview of Methods

Any “simple”, fast-to-evaluate model structure can be used for emulation:

  • Gaussian processes;
  • Artificial neural networks (or other ML methods);
  • Polynomial chaos expansions;
  • Radial basis functions;
  • Reduced-form models (e.g., the semi-empirical sea-level rise model)
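As a second example from the list, a radial basis function emulator is just a weighted sum of bumps centered at the design points, with weights from one linear solve. A sketch (the training function and shape parameter are assumptions; the shape parameter would normally be tuned):

```python
import numpy as np

# Hypothetical model output at a handful of design points (assumed)
x_train = np.linspace(-1, 1, 7)
y_train = np.exp(-x_train) * np.cos(4 * x_train)

# Gaussian radial basis functions centered at the training points
eps = 2.0  # shape parameter (assumed; would normally be tuned)
phi = lambda r: np.exp(-(eps * r) ** 2)

# Fit: solve for the weights that interpolate the training data
A = phi(np.abs(x_train[:, None] - x_train[None, :]))
weights = np.linalg.solve(A, y_train)

def rbf_emulator(x_new):
    """Evaluate the fitted RBF surrogate at new points."""
    r = np.abs(np.atleast_1d(x_new)[:, None] - x_train[None, :])
    return phi(r) @ weights
```

Compared to a GP, this gives point predictions only (no built-in uncertainty), but the fit is a single linear solve, which illustrates the interpretability-vs-complexity tradeoff on the next slide.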

How To Choose An Emulation Method?

  • Dimensionality of problem
  • Interpretability vs. response surface complexity
  • Needed number of training evaluations
  • Hyperparameter tuning

Selecting Parameters For Simplification

Simplification often involves down-selecting parameters of interest.

This could be based on:

  1. Scientific relevance;
  2. Factor importance

Factor Prioritization

Modes of Sensitivity Analysis

Source: Reed et al. (2022)

How to Rank Factors?

Sensitivity Analysis:

  • All-at-Once vs. One-at-a-Time
  • Local vs. Global

Good overview with some notebooks: Reed et al. (2022)
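A toy contrast between the two modes, in pure NumPy. The model and parameter ranges are assumptions chosen so that the local and global rankings disagree:

```python
import numpy as np

# Hypothetical model (assumed): x2 looks unimportant near the nominal
# point (derivative ~0 at x2 = 0), but dominates over its full range
def model(x1, x2):
    return x1 + x2 ** 2

# --- Local, one-at-a-time (OAT): perturb each factor around a base point
base, delta = np.zeros(2), 0.1
oat = []
for i in range(2):
    x = base.copy()
    x[i] += delta
    oat.append((model(*x) - model(*base)) / delta)
# oat ~ [1.0, 0.1]: OAT ranks x1 far above x2

# --- Global, all-at-once: sample both factors over their full ranges
rng = np.random.default_rng(1)
x1 = rng.uniform(-1, 1, 10_000)
x2 = rng.uniform(-3, 3, 10_000)
y = model(x1, x2)

def first_order_effect(x, y, bins=20):
    """Crude first-order index: variance of the conditional mean of y
    given x (binned on quantiles), as a fraction of total variance."""
    edges = np.quantile(x, np.linspace(0, 1, bins + 1))
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, bins - 1)
    cond_means = np.array([y[idx == b].mean() for b in range(bins)])
    return cond_means.var() / y.var()

s1, s2 = first_order_effect(x1, y), first_order_effect(x2, y)
# Globally the ranking flips: s2 >> s1
```

This is the basic motivation for global, variance-based methods (e.g., Sobol indices, as implemented in SALib): local derivatives at one point can badly misrank factors with nonlinear effects.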

Types of Sensitivity Analysis

Source: Reed et al. (2022)

Design of Experiments

Source: Reed et al. (2022)

Class Review

Why Does Data Analysis Matter?

  • Scientific insight;
  • Decision-making;
  • Understanding uncertainty

The Ideal

Source: XKCD #2400

Modes of Data Analysis

What Did We Do?

  1. Probability Models for Data
  2. Bayesian and Frequentist Statistics
  3. Monte Carlo/Bootstrap Simulation
  4. Assessing Model-Data Fit and Hypothesis Testing
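As a reminder of item 3, the bootstrap idea fits in a few lines: resample the data with replacement and recompute the statistic to get an uncertainty interval. A minimal sketch with assumed toy data:

```python
import numpy as np

# Toy data (assumed for illustration)
rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=100)

# Resample with replacement many times and recompute the statistic
boot_means = np.array([
    rng.choice(data, size=data.size, replace=True).mean()
    for _ in range(5_000)
])

# Percentile bootstrap 95% interval for the mean
ci = np.percentile(boot_means, [2.5, 97.5])
```

The same recipe works for any statistic (median, regression coefficient, quantile), which is what makes the bootstrap such a flexible default.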

What Are Some Next Directions?

  • More specialized models and statistical methods (time series, spatial statistics, hidden Markov models, model-based clustering, etc.)
  • Machine learning and clustering
  • Dimension reduction (principal components, singular value decomposition, etc.)

Key Takeaways and Upcoming Schedule

Upcoming Schedule

Friday: HW4 due

Next Monday: Project presentations; email slides by Saturday.

References

Helgeson, C., Srikrishnan, V., Keller, K., & Tuana, N. (2021). Why simpler computer simulation models can be epistemically better for informing decisions. Philos. Sci., 88(2), 213–233. https://doi.org/10.1086/711501
Reed, P. M., Hadjimichael, A., Malek, K., Karimi, T., Vernon, C. R., Srikrishnan, V., et al. (2022). Addressing uncertainty in multisector dynamics research. Zenodo. Retrieved from https://immm-sfa.github.io/msd_uncertainty_ebook/