Goodness-of fit test for the functional linear model using random projections

Tests the composite null hypothesis of a Functional Linear Model with scalar response (FLM), $$H_0:\,Y=\langle X,\beta\rangle+\epsilon\quad\mathrm{vs}\quad H_1:\,Y\neq\langle X,\beta\rangle+\epsilon.$$ If $\beta=\beta_0$ is provided, then the simple hypothesis $H_0:\,Y=\langle X,\beta_0\rangle+\epsilon$ is tested. The way of testing the null hypothesis is via a norm (Cramer-von Mises or Kolmogorov-Smirnov) in the empirical process indexed by the projections.

No NA's are allowed neither in the functional covariate nor in the scalar response.

Usage

rp.flm.test(
  X.fdata,
  Y,
  beta0.fdata = NULL,
  B = 1000,
  n.proj = 10,
  est.method = "pc",
  p = NULL,
  p.criterion = "SICc",
  pmax = 20,
  type.basis = "bspline",
  projs = 0.95,
  verbose = TRUE,
  same.rwild = FALSE,
  seed = NULL,
  ...
)

Arguments

X.fdata

functional observations in the class fdata.

Y

scalar responses for the FLM. Must be a vector with the same number of elements as functions are in X.fdata.

beta0.fdata

functional parameter for the simple null hypothesis, in the fdata class. The argvals and rangeval arguments of beta0.fdata must be the same of X.fdata. If beta0.fdata=NULL (default), the function will test for the composite null hypothesis.

B

number of bootstrap replicates to calibrate the distribution of the test statistic.

n.proj

vector with the number of projections to consider.

est.method

Estimation method for $\beta$, only used in the composite case. There are three methods:

"pc": if p is given, then $\beta$ is estimated by fregre.pc. Otherwise, p is chosen using fregre.pc.cv and the p.criterion criterion.
"pls": if p is given, $\beta$ is estimated by fregre.pls. Otherwise, p is chosen using fregre.pls.cv and the p.criterion criterion.
"basis": if p is given, $\beta$ is estimated by fregre.basis. Otherwise, p is chosen using fregre.basis.cv and the p.criterion criterion. Both in fregre.basis and fregre.basis.cv, the same basis for basis.x and basis.b is considered.

p

number of elements for the basis representation of beta0.fdata and X.fdata with the est.method (only composite hypothesis). If not supplied, it is estimated from the data.

p.criterion

for est.method equal to "pc" or "pls", either "SIC", "SICc" or one of the criterions described in fregre.pc.cv. For "basis" a value for type.CV in fregre.basis.cv such as GCV.S.

pmax

maximum size of the basis expansion to consider in when using p.criterion.

type.basis

type of basis if est.method = "basis".

projs

a fdata object containing the random directions employed to project X.fdata. If numeric, the convenient value for ncomp in rdir.pc.

verbose

whether to show or not information about the testing progress.

same.rwild

wether to employ the same wild bootstrap residuals for different projections or not.

seed

seed to be employed with set.seed for initializing random projections.

...

further arguments passed to create.basis (not rangeval that is taken as the rangeval of X.fdata).

Value

An object with class "htest" whose underlying structure is a list containing the following components:

p.values.fdr: a matrix of size c(n.proj, 2), containing in each row the FDR p-values of the CvM and KS tests up to that projection.
proj.statistics: a matrix of size c(max(n.proj), 2) with the value of the test statistic on each projection.
boot.proj.statistics: an array of size c(max(n.proj), 2, B) with the values of the bootstrap test statistics for each projection.
proj.p.values: a matrix of size c(max(n.proj), 2).
method: information about the test performed and the kind of estimation performed.
B: number of bootstrap replicates used.
n.proj: number of projections specified.
projs: random directions employed to project X.fdata.
type.basis: type of basis for est.method = "basis".
beta.est: estimated functional parameter $\hat \beta$ in the composite hypothesis. For the simple hypothesis, beta0.fdata.
p: number of basis elements considered for estimation of $\beta$.
p.criterion: criterion employed for selecting p.
data.name: the character string "Y = <X, b> + e".

References

Cuesta-Albertos, J.A., Garcia-Portugues, E., Febrero-Bande, M. and Gonzalez-Manteiga, W. (2017). Goodness-of-fit tests for the functional linear model based on randomly projected empirical processes. arXiv:1701.08363. https://arxiv.org/abs/1701.08363

Garcia-Portugues, E., Gonzalez-Manteiga, W. and Febrero-Bande, M. (2014). A goodness-of-fit test for the functional linear model with scalar response. Journal of Computational and Graphical Statistics, 23(3), 761–778. doi:10.1080/10618600.2013.812519

Author

Eduardo Garcia-Portugues (edgarcia@est-econ.uc3m.es) and
Manuel Febrero-Bande (manuel.febrero@usc.es).

Examples

if (FALSE) { # \dontrun{
# Simulated example

set.seed(345678)
t <- seq(0, 1, l = 101)
n <- 100
X <- r.ou(n = n, t = t, alpha = 2, sigma = 0.5)
beta0 <- fdata(mdata = cos(2 * pi * t) - (t - 0.5)^2, argvals = t,
               rangeval = c(0,1))
Y <- inprod.fdata(X, beta0) + rnorm(n, sd = 0.1)

# Test all cases
rp.flm.test(X.fdata = X, Y = Y, est.method = "pc")
rp.flm.test(X.fdata = X, Y = Y, est.method = "pls")
rp.flm.test(X.fdata = X, Y = Y, est.method = "basis", 
            p.criterion = fda.usc.devel::GCV.S)
rp.flm.test(X.fdata = X, Y = Y, est.method = "pc", p = 5)
rp.flm.test(X.fdata = X, Y = Y, est.method = "pls", p = 5)
rp.flm.test(X.fdata = X, Y = Y, est.method = "basis", p = 5)
rp.flm.test(X.fdata = X, Y = Y, beta0.fdata = beta0)

# Composite hypothesis: do not reject FLM
rp.test <- rp.flm.test(X.fdata = X, Y = Y, est.method = "pc")
rp.test$p.values.fdr
pcvm.test <- flm.test(X.fdata = X, Y = Y, est.method = "pc", B = 1e3,
                      plot.it = FALSE)
pcvm.test

# Estimation of beta
par(mfrow = c(1, 3))
plot(X, main = "X")
plot(beta0, main = "beta")
lines(rp.test$beta.est, col = 2)
lines(pcvm.test$beta.est, col = 3)
plot(density(Y), main = "Density of Y", xlab = "Y", ylab = "Density")
rug(Y)

# Simple hypothesis: do not reject beta = beta0
rp.flm.test(X.fdata = X, Y = Y, beta0.fdata = beta0)$p.values.fdr
flm.test(X.fdata = X, Y = Y, beta0.fdata = beta0, B = 1e3, plot.it = FALSE)

# Simple hypothesis: reject beta = beta0^2
rp.flm.test(X.fdata = X, Y = Y, beta0.fdata = beta0^2)$p.values.fdr
flm.test(X.fdata = X, Y = Y, beta0.fdata = beta0^2, B = 1e3, plot.it = FALSE)

# Tecator dataset

# Load data
data(tecator)
absorp <- tecator$absorp.fdata
ind <- 1:129 # or ind <- 1:215
x <- absorp[ind, ]
y <- tecator$y$Fat[ind]

# Composite hypothesis
rp.tecat <- rp.flm.test(X.fdata = x, Y = y, est.method = "pc")
pcvm.tecat <- flm.test(X.fdata = x, Y = y, est.method = "pc", B = 1e3,
                       plot.it = FALSE)
rp.tecat$p.values.fdr[c(5, 10), ]
pcvm.tecat

# Simple hypothesis
zero <- fdata(mdata = rep(0, length(x$argvals)), argvals = x$argvals,
              rangeval = x$rangeval)
rp.flm.test(X.fdata = x, Y = y, beta0.fdata = zero)
flm.test(X.fdata = x, Y = y, beta0.fdata = zero, B = 1e3)

# With derivatives
rp.tecat <- rp.flm.test(X.fdata = fdata.deriv(x, 1), Y = y, est.method = "pc")
rp.tecat$p.values.fdr
rp.tecat <- rp.flm.test(X.fdata = fdata.deriv(x, 2), Y = y, est.method = "pc")
rp.tecat$p.values.fdr

# AEMET dataset

# Load data
data(aemet)
wind.speed <- apply(aemet$wind.speed$data, 1, mean)
temp <- aemet$temp

# Remove the 5% of the curves with less depth (i.e. 4 curves)
par(mfrow = c(1, 1))
res.FM <- depth.FM(temp, draw = TRUE)
qu <- quantile(res.FM$dep, prob = 0.05)
l <- which(res.FM$dep <= qu)
lines(aemet$temp[l], col = 3)

# Data without outliers
wind.speed <- wind.speed[-l]
temp <- temp[-l]

# Composite hypothesis
rp.aemet <- rp.flm.test(X.fdata = temp, Y = wind.speed, est.method = "pc")
pcvm.aemet <- flm.test(X.fdata = temp, Y = wind.speed, B = 1e3,
                       est.method = "pc", plot.it = FALSE)
rp.aemet$p.values.fdr
apply(rp.aemet$p.values.fdr, 2, range)
pcvm.aemet

# Simple hypothesis
zero <- fdata(mdata = rep(0, length(temp$argvals)), argvals = temp$argvals,
              rangeval = temp$rangeval)
flm.test(X.fdata = temp, Y = wind.speed, beta0.fdata = zero, B = 1e3,
         plot.it = FALSE)
rp.flm.test(X.fdata = temp, Y = wind.speed, beta0.fdata = zero)
} # }