Goodness-of fit test for the functional linear model using random projections
Source:R/rp.flm.test.R
rp.flm.test.Rd
Tests the composite null hypothesis of a Functional Linear Model with scalar response (FLM), $$H_0:\,Y=\langle X,\beta\rangle+\epsilon\quad\mathrm{vs}\quad H_1:\,Y\neq\langle X,\beta\rangle+\epsilon.$$ If \(\beta=\beta_0\) is provided, then the simple hypothesis \(H_0:\,Y=\langle X,\beta_0\rangle+\epsilon\) is tested. The way of testing the null hypothesis is via a norm (Cramer-von Mises or Kolmogorov-Smirnov) in the empirical process indexed by the projections.
No NA's are allowed neither in the functional covariate nor in the scalar response.
Usage
rp.flm.test(
X.fdata,
Y,
beta0.fdata = NULL,
B = 1000,
n.proj = 10,
est.method = "pc",
p = NULL,
p.criterion = "SICc",
pmax = 20,
type.basis = "bspline",
projs = 0.95,
verbose = TRUE,
same.rwild = FALSE,
seed = NULL,
...
)
Arguments
- X.fdata
functional observations in the class
fdata
.- Y
scalar responses for the FLM. Must be a vector with the same number of elements as functions are in
X.fdata
.- beta0.fdata
functional parameter for the simple null hypothesis, in the
fdata
class. Theargvals
andrangeval
arguments ofbeta0.fdata
must be the same ofX.fdata
. Ifbeta0.fdata=NULL
(default), the function will test for the composite null hypothesis.- B
number of bootstrap replicates to calibrate the distribution of the test statistic.
- n.proj
vector with the number of projections to consider.
- est.method
Estimation method for \(\beta\), only used in the composite case. There are three methods:
"pc"
: ifp
is given, then \(\beta\) is estimated byfregre.pc
. Otherwise,p
is chosen usingfregre.pc.cv
and thep.criterion
criterion."pls"
: ifp
is given, \(\beta\) is estimated byfregre.pls
. Otherwise,p
is chosen usingfregre.pls.cv
and thep.criterion
criterion."basis"
: ifp
is given, \(\beta\) is estimated byfregre.basis
. Otherwise,p
is chosen usingfregre.basis.cv
and thep.criterion
criterion. Both infregre.basis
andfregre.basis.cv
, the same basis forbasis.x
andbasis.b
is considered.
- p
number of elements for the basis representation of
beta0.fdata
andX.fdata
with theest.method
(only composite hypothesis). If not supplied, it is estimated from the data.- p.criterion
for
est.method
equal to"pc"
or"pls"
, either"SIC"
,"SICc"
or one of the criterions described infregre.pc.cv
. For"basis"
a value fortype.CV
infregre.basis.cv
such asGCV.S
.- pmax
maximum size of the basis expansion to consider in when using
p.criterion
.- type.basis
type of basis if
est.method = "basis"
.- projs
a
fdata
object containing the random directions employed to projectX.fdata
. If numeric, the convenient value forncomp
inrdir.pc
.- verbose
whether to show or not information about the testing progress.
- same.rwild
wether to employ the same wild bootstrap residuals for different projections or not.
- seed
seed to be employed with
set.seed
for initializing random projections.- ...
further arguments passed to create.basis (not
rangeval
that is taken as therangeval
ofX.fdata
).
Value
An object with class "htest"
whose underlying structure is a
list containing the following components:
p.values.fdr
: a matrix of sizec(n.proj, 2)
, containing in each row the FDR p-values of the CvM and KS tests up to that projection.proj.statistics
: a matrix of sizec(max(n.proj), 2)
with the value of the test statistic on each projection.boot.proj.statistics
: an array of sizec(max(n.proj), 2, B)
with the values of the bootstrap test statistics for each projection.proj.p.values
: a matrix of sizec(max(n.proj), 2)
.method
: information about the test performed and the kind of estimation performed.B
: number of bootstrap replicates used.n.proj
: number of projections specified.projs
: random directions employed to projectX.fdata
.type.basis
: type of basis forest.method = "basis"
.beta.est
: estimated functional parameter \(\hat \beta\) in the composite hypothesis. For the simple hypothesis,beta0.fdata
.p
: number of basis elements considered for estimation of \(\beta\).p.criterion
: criterion employed for selectingp
.data.name
: the character string "Y = <X, b> + e".
References
Cuesta-Albertos, J.A., Garcia-Portugues, E., Febrero-Bande, M. and Gonzalez-Manteiga, W. (2017). Goodness-of-fit tests for the functional linear model based on randomly projected empirical processes. arXiv:1701.08363. https://arxiv.org/abs/1701.08363
Garcia-Portugues, E., Gonzalez-Manteiga, W. and Febrero-Bande, M. (2014). A goodness-of-fit test for the functional linear model with scalar response. Journal of Computational and Graphical Statistics, 23(3), 761–778. doi:10.1080/10618600.2013.812519
Author
Eduardo Garcia-Portugues (edgarcia@est-econ.uc3m.es) and
Manuel Febrero-Bande (manuel.febrero@usc.es).
Examples
if (FALSE) { # \dontrun{
# Simulated example
set.seed(345678)
t <- seq(0, 1, l = 101)
n <- 100
X <- r.ou(n = n, t = t, alpha = 2, sigma = 0.5)
beta0 <- fdata(mdata = cos(2 * pi * t) - (t - 0.5)^2, argvals = t,
rangeval = c(0,1))
Y <- inprod.fdata(X, beta0) + rnorm(n, sd = 0.1)
# Test all cases
rp.flm.test(X.fdata = X, Y = Y, est.method = "pc")
rp.flm.test(X.fdata = X, Y = Y, est.method = "pls")
rp.flm.test(X.fdata = X, Y = Y, est.method = "basis",
p.criterion = fda.usc.devel::GCV.S)
rp.flm.test(X.fdata = X, Y = Y, est.method = "pc", p = 5)
rp.flm.test(X.fdata = X, Y = Y, est.method = "pls", p = 5)
rp.flm.test(X.fdata = X, Y = Y, est.method = "basis", p = 5)
rp.flm.test(X.fdata = X, Y = Y, beta0.fdata = beta0)
# Composite hypothesis: do not reject FLM
rp.test <- rp.flm.test(X.fdata = X, Y = Y, est.method = "pc")
rp.test$p.values.fdr
pcvm.test <- flm.test(X.fdata = X, Y = Y, est.method = "pc", B = 1e3,
plot.it = FALSE)
pcvm.test
# Estimation of beta
par(mfrow = c(1, 3))
plot(X, main = "X")
plot(beta0, main = "beta")
lines(rp.test$beta.est, col = 2)
lines(pcvm.test$beta.est, col = 3)
plot(density(Y), main = "Density of Y", xlab = "Y", ylab = "Density")
rug(Y)
# Simple hypothesis: do not reject beta = beta0
rp.flm.test(X.fdata = X, Y = Y, beta0.fdata = beta0)$p.values.fdr
flm.test(X.fdata = X, Y = Y, beta0.fdata = beta0, B = 1e3, plot.it = FALSE)
# Simple hypothesis: reject beta = beta0^2
rp.flm.test(X.fdata = X, Y = Y, beta0.fdata = beta0^2)$p.values.fdr
flm.test(X.fdata = X, Y = Y, beta0.fdata = beta0^2, B = 1e3, plot.it = FALSE)
# Tecator dataset
# Load data
data(tecator)
absorp <- tecator$absorp.fdata
ind <- 1:129 # or ind <- 1:215
x <- absorp[ind, ]
y <- tecator$y$Fat[ind]
# Composite hypothesis
rp.tecat <- rp.flm.test(X.fdata = x, Y = y, est.method = "pc")
pcvm.tecat <- flm.test(X.fdata = x, Y = y, est.method = "pc", B = 1e3,
plot.it = FALSE)
rp.tecat$p.values.fdr[c(5, 10), ]
pcvm.tecat
# Simple hypothesis
zero <- fdata(mdata = rep(0, length(x$argvals)), argvals = x$argvals,
rangeval = x$rangeval)
rp.flm.test(X.fdata = x, Y = y, beta0.fdata = zero)
flm.test(X.fdata = x, Y = y, beta0.fdata = zero, B = 1e3)
# With derivatives
rp.tecat <- rp.flm.test(X.fdata = fdata.deriv(x, 1), Y = y, est.method = "pc")
rp.tecat$p.values.fdr
rp.tecat <- rp.flm.test(X.fdata = fdata.deriv(x, 2), Y = y, est.method = "pc")
rp.tecat$p.values.fdr
# AEMET dataset
# Load data
data(aemet)
wind.speed <- apply(aemet$wind.speed$data, 1, mean)
temp <- aemet$temp
# Remove the 5% of the curves with less depth (i.e. 4 curves)
par(mfrow = c(1, 1))
res.FM <- depth.FM(temp, draw = TRUE)
qu <- quantile(res.FM$dep, prob = 0.05)
l <- which(res.FM$dep <= qu)
lines(aemet$temp[l], col = 3)
# Data without outliers
wind.speed <- wind.speed[-l]
temp <- temp[-l]
# Composite hypothesis
rp.aemet <- rp.flm.test(X.fdata = temp, Y = wind.speed, est.method = "pc")
pcvm.aemet <- flm.test(X.fdata = temp, Y = wind.speed, B = 1e3,
est.method = "pc", plot.it = FALSE)
rp.aemet$p.values.fdr
apply(rp.aemet$p.values.fdr, 2, range)
pcvm.aemet
# Simple hypothesis
zero <- fdata(mdata = rep(0, length(temp$argvals)), argvals = temp$argvals,
rangeval = temp$rangeval)
flm.test(X.fdata = temp, Y = wind.speed, beta0.fdata = zero, B = 1e3,
plot.it = FALSE)
rp.flm.test(X.fdata = temp, Y = wind.speed, beta0.fdata = zero)
} # }