Debiased score test for goodness of fit of an mgcv::gam fit.
Arguments
- object
Fitted mgcv::gam object.
- hunt.style
Hunting algorithm, with the following options:
'optimal': optimal hunting (default). See hunt_optimal.
'wls': a simpler hunting procedure using weighted least squares, which can be less powerful. See hunt_wls.
'vanilla': basic hunting; not recommended unless an alternative model cannot be fitted with weighted least squares. See hunt_vanilla.
- hunt.method
Built-in method for hunting. Currently available:
'grf': regression forest from package grf.
When this is set to any other value, the arguments hunt_fun, arg.hunt_fun and predict_fun_alt are used to specify a customized hunting method.
- hunt_fun
Default NULL. When hunt.method is not set to a built-in method, this is a customized function for hunting. When hunt.style is 'optimal' or 'wls', this function must have signature hunt_fun(y, X, w, ...) and return a fitted alternative model \(\hat{g} \in \mathcal{G}\) obtained via weighted least squares, i.e., by minimizing \(\sum_i w_i (y_i - g(x_i))^2\) over \(g \in \mathcal{G}\); otherwise, for 'vanilla' hunting, this function must have signature hunt_fun(y, X, ...) and return an alternative model fitted in any fashion. The returned object g must support predict_fun_alt(g, X) for evaluation.
- trim.outlier.hunt
If TRUE (default), extreme values produced by the hunted function will be trimmed using Tukey's IQR rule.
- X.cols.exclude
Columns in stats::model.matrix(object) to be excluded when hunting for alternative signal. Default NULL.
- splits
Numeric vector of length 2 or 3 giving the relative sizes of the sample splits; rescaled internally to sum to one. Default is c(0.5, 0.5), which splits data into two halves for hunt and test respectively. Though typically unnecessary in practice, one can also specify a 3-way split for hunt, debiasing and test respectively.
- arg.hunt_fun
Extra arguments (default NULL) passed to the customized hunt_fun.
- predict_fun_alt
When a customized hunt_fun is used, this is a function with signature predict_fun_alt(fit, X) returning a numeric vector of predictions from a fitted alternative model produced by hunt_fun().
- verbose
Default FALSE; information is printed if set to TRUE.
- ...
Unused; present for S3 generic/method consistency.
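The contract between hunt_fun and predict_fun_alt described above can be illustrated with a small sketch. Everything here is hypothetical and for illustration only: the names hunt_poly, predict_poly and make_design, the degree argument, and the choice of a polynomial alternative class are not part of the package. The sketch assumes X arrives as a numeric matrix with one column per predictor, and it fits by weighted least squares as required for the 'optimal' and 'wls' styles.

```r
# Hypothetical helper: raw polynomial design matrix with an intercept column.
make_design <- function(X, degree) {
  X <- as.matrix(X)
  cbind(1, do.call(cbind, lapply(seq_len(degree), function(d) X^d)))
}

# Hypothetical hunt_fun(y, X, w, ...): minimizes sum_i w_i (y_i - g(x_i))^2
# over polynomials of the given degree, using stats::lm.wfit().
hunt_poly <- function(y, X, w, degree = 2, ...) {
  fit <- stats::lm.wfit(make_design(X, degree), y, w)
  coefs <- fit$coefficients
  coefs[is.na(coefs)] <- 0  # drop aliased terms if the design is rank deficient
  list(coef = coefs, degree = degree)
}

# Hypothetical predict_fun_alt(fit, X): numeric vector of predictions.
predict_poly <- function(fit, X) {
  drop(make_design(X, fit$degree) %*% fit$coef)
}

# Stand-alone check on simulated data (no fitted gam needed here).
set.seed(1)
n <- 200
X <- cbind(x1 = runif(n), x2 = runif(n))
y <- 1 + 2 * X[, "x1"]^2 + rnorm(n, sd = 0.1)
w <- runif(n, 0.5, 1.5)
g <- hunt_poly(y, X, w, degree = 2)
pred <- predict_poly(g, X)

# Assumed usage with a fitted gam `fit`; the value "custom" and the list form
# of arg.hunt_fun are assumptions, not confirmed by this page:
# test <- gof_test(fit, hunt.method = "custom", hunt_fun = hunt_poly,
#                  arg.hunt_fun = list(degree = 2),
#                  predict_fun_alt = predict_poly)
```

Returning a plain list keeps the fitted object small and makes predict_poly depend only on the stored coefficients, which is all the predict_fun_alt(g, X) contract asks for.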
Details
Only the numeric predictors appearing in stats::model.frame(object)
are exposed to the hunt; X.cols.exclude indexes into these
predictor variables (not basis columns). Factor-by smooths and other
non-numeric predictors are not currently supported. Formulas using
offset() terms, a weights argument, or a multi-column
response (e.g. cbind(succ, fail) ~ ...) are also not supported.
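The difference between predictor variables and basis columns can be seen with a short sketch. It is an illustration of the distinction drawn above, not the package's internal code; the variable names mf, resp and predictors are chosen here for the example.

```r
library(mgcv)

set.seed(42)
dat <- gamSim(eg = 1, n = 200, dist = "normal", scale = 2, verbose = FALSE)
fit <- gam(y ~ s(x0) + s(x1), data = dat)

# Predictor variables: numeric columns of the model frame, minus the response.
mf <- stats::model.frame(fit)
resp <- all.vars(stats::formula(fit))[1]
is_num <- vapply(mf, is.numeric, logical(1))
predictors <- setdiff(names(mf)[is_num], resp)
print(predictors)

# Basis columns: one column per spline basis function plus the intercept,
# so there are many more of these than there are predictor variables.
n_basis <- ncol(stats::model.matrix(fit))
print(n_basis)
```

Under the description above, an index passed to X.cols.exclude would refer to a position in predictors rather than to a column of the basis matrix.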
Examples
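set.seed(42)
# Simulate from the four-term additive example in mgcv
dat <- mgcv::gamSim(eg = 1, n = 400, dist = "normal", scale = 2, verbose = FALSE)
dat.0 <- dat[, 1:5]  # keep y and x0, ..., x3
# well-specified: all four smooth terms are included
fit.0 <- mgcv::gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat.0)
test.0 <- gof_test(fit.0)
# f3 = 0, so the model without s(x3) is also well-specified
fit.1 <- mgcv::gam(y ~ s(x0) + s(x1) + s(x2), data = dat.0)
test.1 <- gof_test(fit.1)
plot(test.1)
# misspecified: multiplying y by f0 breaks the additive structure of the mean
dat.1 <- dat.0
dat.1$y <- dat.1$y * dat$f0
fit.2 <- mgcv::gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat.1)
test.2 <- gof_test(fit.2)
plot(test.2)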
