If \(p\text{-value} < \alpha\), “reject” the null hypothesis in favor of the alternative and say that the effect is “statistically significant.”
Otherwise, do not reject the null.
The goal is to strike a balance between Type I and Type II errors.
Error Types

| Decision About Null Hypothesis | Null Hypothesis Is True | Null Hypothesis Is False |
|---|---|---|
| Don't reject | True negative (probability \(1-\alpha\)) | Type II error (false negative, probability \(\beta\)) |
| Reject | Type I error (false positive, probability \(\alpha\)) | True positive (probability \(1-\beta\)) |
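By construction, testing at level \(\alpha\) produces false positives with probability \(\alpha\) when the null holds. A quick Monte Carlo sketch (a hypothetical two-sided \(z\)-test on standard-normal data; the sample size and seed are illustrative) shows this:

```julia
using Random, Statistics

rng = Random.MersenneTwister(1)
n = 30          # sample size per experiment (illustrative)
z_crit = 1.96   # two-sided critical value for α = 0.05

# with known σ = 1, the z-statistic is mean(x) * sqrt(n); reject if |z| > z_crit
reject(x) = abs(mean(x)) * sqrt(length(x)) > z_crit

# simulate many experiments where H0 is true (data are pure noise);
# the fraction rejected estimates the Type I error rate, which should be ≈ α
type_i = mean(reject(randn(rng, n)) for _ in 1:100_000)
```

With enough replications, `type_i` settles near 0.05, matching \(\alpha\).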
But What Is Statistical Significance?
But "statistical significance" does not mean:
That the null is “wrong”;
That the alternative is a better descriptor of the data-generating process;
That the effect size of the hypothesized mechanism is "significant".
What a \(p\)-value is Not
Probability that the null hypothesis is true;
Probability that the effect was produced by chance alone;
An indication of the effect size.
How Might We Do Better?
Consideration of multiple plausible (possibly more nuanced) hypotheses.
Assessment/quantification of evidence consistent with different hypotheses.
Insight into the effect size.
Model Assessment
Fundamental Data Analysis Challenge
Goal (often): Explain data and/or make predictions about unobserved data.
Challenges: Environmental systems are:
high-dimensional
multi-scale
nonlinear
subject to many uncertainties
Multiplicities of Models
In general, we are in an \(\mathcal{M}\)-open setting: no model is the “true” data-generating model, so we want to pick a model which performs well enough for the intended purpose.
The contrast to this is \(\mathcal{M}\)-closed, in which one of the models under consideration is the “true” data-generating model, and we would like to recover it.
What Is Any Statistical Test Doing?
If we think about what a test like Mann-Kendall is doing:
Assume the null hypothesis \(H_0\);
Obtain the sampling distribution of a test statistic \(S\) which captures the property of interest under \(H_0\);
Compute the test statistic \(\hat{S}\) on the data;
Calculate the probability under \(H_0\) of a value of \(S\) at least as extreme as \(\hat{S}\) (the \(p\)-value).
None of this requires an NHST framework!
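As a sketch of these steps, the Mann-Kendall statistic and its sampling distribution under \(H_0\) (no trend, so every ordering of the data is equally likely) can be obtained by permutation; the function names, example series, and seeds below are illustrative:

```julia
using Random, Statistics

# Mann-Kendall statistic: sum of signs of all pairwise differences
mk_statistic(y) = sum(sign(y[j] - y[i]) for i in 1:length(y)-1 for j in i+1:length(y))

# permutation-based p-value: shuffling the data generates the sampling
# distribution of S under H0 (exchangeability / no trend)
function mk_pvalue(y; nperm=10_000, rng=Random.MersenneTwister(1))
    s_obs = mk_statistic(y)
    s_null = [mk_statistic(shuffle(rng, y)) for _ in 1:nperm]
    return mean(abs.(s_null) .>= abs(s_obs))  # two-sided p-value
end

y = collect(1:20) .+ randn(Random.MersenneTwister(2), 20)  # noisy upward trend
mk_pvalue(y)  # should be small for a trend this strong
```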
Simulation for Statistical Testing
Instead, if we have a model which permits simulation:
Calibrate models under different assumptions (e.g. stationary vs. nonstationary based on different covariates);
Simulate realizations from those models;
Compute the distribution of the relevant statistic \(S\) from these realizations;
Assess which distribution is most consistent with the observed quantity.
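The simulation workflow above can be sketched as follows; the candidate models (a stationary noise model vs. a linear-trend model), the slope statistic, and all numbers here are illustrative stand-ins, not a specific calibrated model:

```julia
using Random, Statistics

rng = Random.MersenneTwister(1)

# statistic of interest: least-squares slope against time
slope(y) = cov(1:length(y), y) / var(1:length(y))

n = 50
y_obs = 0.05 .* (1:n) .+ randn(rng, n)  # stand-in for an observed series
stat_obs = slope(y_obs)

# simulate realizations from each candidate model and compute the statistic
sims_stationary = [slope(randn(rng, n)) for _ in 1:10_000]
sims_trend = [slope(0.05 .* (1:n) .+ randn(rng, n)) for _ in 1:10_000]

# compare the observed statistic to each simulated distribution,
# e.g. via central 95% intervals of the statistic under each model
quantile(sims_stationary, [0.025, 0.975])
quantile(sims_trend, [0.025, 0.975])
```

The model whose simulated distribution contains the observed statistic is the one most consistent with the data, without framing the comparison as a null-hypothesis test.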
Model Assessment Criteria
How do we assess models?
Two general categories:
How well do we explain the data?
How well do we predict new data?
Explanatory Criteria
Generally based on error metrics (e.g. RMSE, MAE) or the probability of the data under the model, \(p(y | M)\).
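A minimal sketch of the two error metrics (function names are ours, not from a particular package):

```julia
using Statistics

# root-mean-square error: penalizes large errors more heavily
rmse(y, ŷ) = sqrt(mean((y .- ŷ) .^ 2))

# mean absolute error: weights all errors linearly
mae(y, ŷ) = mean(abs.(y .- ŷ))

y  = [1.0, 2.0, 3.0]  # observations
ŷ  = [1.5, 2.0, 2.5]  # model output
rmse(y, ŷ)  # ≈ 0.408
mae(y, ŷ)   # ≈ 0.333
```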
```julia
# load required packages
using CSV, DataFrames, DataFramesMeta, Dates, Statistics
using Plots, Measures

# load SF tide gauge data
# read in data and get annual maxima
function load_data(fname)
    date_format = DateFormat("yyyy-mm-dd HH:MM:SS")
    # This uses the DataFramesMeta.jl package, which makes it easy to
    # string together commands to load and process data
    df = @chain fname begin
        CSV.read(DataFrame; header=false)
        rename("Column1" => "year", "Column2" => "month", "Column3" => "day",
               "Column4" => "hour", "Column5" => "gauge")
        # need to reformat the decimal date in the data file
        @transform :datetime = DateTime.(:year, :month, :day, :hour)
        # replace -99999 with missing
        @transform :gauge = ifelse.(abs.(:gauge) .>= 9999, missing, :gauge)
        select(:datetime, :gauge)
    end
    return df
end
dat = load_data("data/surge/h551.csv")

# detrend the data to remove the effects of sea-level rise and seasonal dynamics
ma_length = 366
ma_offset = Int(floor(ma_length / 2))
moving_average(series, n) = [mean(@view series[i-n:i+n]) for i in (n+1):(length(series)-n)]
dat_ma = DataFrame(
    datetime = dat.datetime[(ma_offset+1):(end-ma_offset)],
    residual = dat.gauge[(ma_offset+1):(end-ma_offset)] .- moving_average(dat.gauge, ma_offset)
)

# group data by year and compute the annual maxima
dat_ma = dropmissing(dat_ma) # drop missing data
dat_annmax = combine(
    dat_ma -> dat_ma[argmax(dat_ma.residual), :],
    groupby(DataFrames.transform(dat_ma, :datetime => x -> year.(x)), :datetime_function)
)
delete!(dat_annmax, nrow(dat_annmax)) # delete 2023; haven't seen much of that year yet
rename!(dat_annmax, :datetime_function => :Year)
select!(dat_annmax, [:Year, :residual])

# make plots
p1 = plot(
    dat_annmax.Year, dat_annmax.residual;
    xlabel = "Year", ylabel = "Annual Max Tide Level (m)",
    label = false, marker = :circle, markersize = 5,
    tickfontsize = 16, guidefontsize = 18,
    left_margin = 5mm, bottom_margin = 5mm
)
```
Figure 2: Annual maxima surge data from the San Francisco, CA tide gauge.