Delsol, Ferraty and Vieu test for no functional-scalar interaction

The function dfv.test tests the null hypothesis of no interaction between a functional covariate and a scalar response in a general framework. The null hypothesis is $$H_0:\,m(X)=0,$$ where $m(\cdot)$ denotes the regression function of the centred scalar response $Y$ ($E[Y]=0$) on the functional covariate $X$. The null hypothesis is tested by means of the smoothed integrated squared error of the response (see Details).

Usage

dfv.statistic(
  X.fdata,
  Y,
  h = quantile(x = metric.lp(X.fdata), probs = c(0.05, 0.1, 0.15, 0.25, 0.5)),
  K = function(x) 2 * dnorm(abs(x)),
  weights = rep(1, dim(X.fdata$data)[1]),
  d = metric.lp,
  dist = NULL
)

dfv.test(
  X.fdata,
  Y,
  B = 5000,
  h = quantile(x = metric.lp(X.fdata), probs = c(0.05, 0.1, 0.15, 0.25, 0.5)),
  K = function(x) 2 * dnorm(abs(x)),
  weights = rep(1, dim(X.fdata$data)[1]),
  d = metric.lp,
  verbose = TRUE
)

Arguments

X.fdata: Functional covariate. The object must be of class fdata.
Y: Scalar response. Must be a vector with as many elements as there are functions in X.fdata.
h: Bandwidth parameter for the kernel smoothing. This is a crucial parameter that affects the power of the test. One way to choose it is to use the cross-validation bandwidth of the nonparametric functional regression, given by the function fregre.np (see Examples). Another is to consider a grid of bandwidths. This is the default option, with the grid given by the quantiles 0.05, 0.10, 0.15, 0.25 and 0.50 of the functional $L^2$ distances of the data.
K: Kernel function. If not specified, it is taken to be the rescaled right half of the normal density, function(x) 2 * dnorm(abs(x)).
weights: A vector of weights for the sample data. The default is uniform weights, rep(1,dim(X.fdata$data)[1]).
d: Semimetric to use in the kernel smoothers. The default is the $L^2$ distance given by metric.lp.
dist: Matrix of distances of the functional data, used to save time in the bootstrap calibration. If not given, the matrix is automatically computed using the semimetric d.
B: Number of bootstrap replicates used to calibrate the distribution of the test statistic. B=5000 replicates are recommended for carrying out the test, although for exploratory (non-inferential) analysis an acceptable, less time-consuming option is B=500.
verbose: Whether to show information about computing progress.
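The roles of K and h can be illustrated with a small base R sketch. It does not call the package: the toy curves, the Riemann-sum approximation of the $L^2$ distance and the names curves, D, K_mass and h_grid are assumptions made only for this illustration, and metric.lp may integrate differently.

```r
# Default kernel from the Usage section: twice the standard normal density,
# evaluated at |x| (hypothetical standalone copy for illustration).
K <- function(x) 2 * dnorm(abs(x))

# The factor 2 makes the kernel integrate to one over [0, Inf).
K_mass <- integrate(K, lower = 0, upper = Inf)$value

# Toy curves (one per row) on an equispaced grid; purely illustrative data.
set.seed(42)
tt <- seq(0, 1, length.out = 101)
curves <- t(replicate(30, cumsum(rnorm(length(tt))) / sqrt(length(tt))))

# Riemann-sum approximation of the pairwise L2 distances (an assumption of
# this sketch, not necessarily the quadrature used by metric.lp).
D <- as.matrix(dist(curves)) * sqrt(tt[2] - tt[1])

# Grid of bandwidths: the quantile levels of the default h applied to all
# entries of the distance matrix.
h_grid <- quantile(D, probs = c(0.05, 0.1, 0.15, 0.25, 0.5))
```

Smaller quantiles give more local smoothing, so a grid of this kind lets the test be inspected at several smoothing levels at once.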

Value

The value of dfv.statistic is a vector of length length(h) with the values of the statistic for each bandwidth. The value of dfv.test is an object with class "htest" whose underlying structure is a list containing the following components:

statistic: The value of the Delsol, Ferraty and Vieu test statistic.
boot.statistics: A vector of length B with the values of the bootstrap test statistics.
p.value: The p-value of the test.
method: The character string "Delsol, Ferraty and Vieu test for no functional-scalar interaction".
B: The number of bootstrap replicates used.
h: Bandwidth parameters for the test.
K: Kernel function used.
weights: The weights considered.
d: Matrix of distances of the functional data.
data.name: The character string "Y=0+e".

Details

The Delsol, Ferraty and Vieu statistic is defined as $$T_n=\int\bigg(\sum_{i=1}^n(Y_i-m(X_i))K\bigg(\frac{d(X,X_i)}{h}\bigg)\bigg)^2\omega(X)\,dP_X(X).$$ In the case of no interaction with centred scalar response (when $H_0:\,m(X)=0$ holds), its sample version is computed as $$T_n=\frac{1}{n}\sum_{j=1}^n\bigg(\sum_{i=1}^n Y_iK\bigg(\frac{d(X_j,X_i)}{h}\bigg)\bigg)^2\omega(X_j).$$

The sample version implemented here does not split the sample, a possibility that the authors discuss in their paper. The statistic is computed by the function dfv.statistic, and the response $Y$ is centred before the test is applied. The distribution of the test statistic is approximated by a wild bootstrap on the residuals, using the golden section bootstrap.
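As a rough illustration of the sample statistic and of a golden section wild bootstrap, the following base R sketch may help. It is not the package code: toy_statistic, golden_weights and toy_test are hypothetical helpers, the distances use a Riemann-sum approximation, and the bootstrap scheme (multiplying the centred response by two-point golden section weights and counting replicates at least as large as the observed statistic) is an assumption about how such a calibration can be done.

```r
set.seed(1)
n <- 40
tt <- seq(0, 1, length.out = 51)

# Toy functional sample (one curve per row) and approximate L2 distances.
X <- t(replicate(n, cumsum(rnorm(length(tt))) / sqrt(length(tt))))
D <- as.matrix(dist(X)) * sqrt(tt[2] - tt[1])

# Sample statistic: mean over j of (sum_i Y_i K(d(X_j, X_i) / h))^2 * w_j.
toy_statistic <- function(D, Y, h, K = function(u) 2 * dnorm(abs(u)),
                          w = rep(1, length(Y))) {
  Kmat <- matrix(K(D / h), nrow = nrow(D))
  inner <- drop(Kmat %*% Y)
  mean(inner^2 * w)
}

# Golden section weights: two-point distribution with mean 0 and variance 1.
golden_weights <- function(n) {
  phi <- (1 + sqrt(5)) / 2
  sample(c(1 - phi, phi), size = n, replace = TRUE,
         prob = c((5 + sqrt(5)) / 10, (5 - sqrt(5)) / 10))
}

# Wild bootstrap calibration under H0 (assumed scheme, see the text above).
toy_test <- function(D, Y, h, B = 200) {
  Yc <- Y - mean(Y)  # centre the response first
  Tn <- toy_statistic(D, Yc, h)
  Tn_star <- replicate(B, toy_statistic(D, Yc * golden_weights(length(Yc)), h))
  list(statistic = Tn, p.value = mean(Tn_star >= Tn))
}

h <- unname(quantile(D[upper.tri(D)], probs = 0.25))
Y0 <- rnorm(n, sd = 0.1)                 # response unrelated to the curves
Y1 <- rowMeans(X) + rnorm(n, sd = 0.1)   # response depending on the curves
res0 <- toy_test(D, Y0, h)
res1 <- toy_test(D, Y1, h)
```

Because the distance matrix D does not change across replicates, it is computed once and reused, which is the same saving that the dist argument of dfv.statistic offers.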

Please note that if a grid of bandwidths is passed, a harmless warning message will be shown at the end of the test (it comes from returning several p-values in the htest class).

Note

NAs are not allowed in either the functional covariate or the scalar response.

References

Delsol, L., Ferraty, F. and Vieu, P. (2011). Structural test in regression on functional variables. Journal of Multivariate Analysis, 102, 422-447. doi:10.1016/j.jmva.2010.10.003

Delsol, L. (2013). No effect tests in regression on functional variable and some applications to spectrometric studies. Computational Statistics, 28(4), 1775-1811. doi:10.1007/s00180-012-0378-1

Author

Eduardo Garcia-Portugues. Please report bugs and suggestions to eduardo.garcia.portugues@uc3m.es

Examples

if (FALSE) { # \dontrun{
## Simulated example ##
X=rproc2fdata(n=50,t=seq(0,1,l=101),sigma="OU")

beta0=fdata(mdata=rep(0,length=101)+rnorm(101,sd=0.05),
argvals=seq(0,1,l=101),rangeval=c(0,1))
beta1=fdata(mdata=cos(2*pi*seq(0,1,l=101))-(seq(0,1,l=101)-0.5)^2+
rnorm(101,sd=0.05),argvals=seq(0,1,l=101),rangeval=c(0,1))

# Null hypothesis holds
Y0=drop(inprod.fdata(X,beta0)+rnorm(50,sd=0.1))

# Null hypothesis does not hold
Y1=drop(inprod.fdata(X,beta1)+rnorm(50,sd=0.1))

# Bandwidth selected by cross-validation with fregre.np
# H0 should not be rejected
dfv.test(X,Y0,h=fregre.np(X,Y0)$h.opt,B=100)
# dfv.test(X,Y0,B=5000)

# Default grid of bandwidths; H0 should be rejected
dfv.test(X,Y1,B=100)
# dfv.test(X,Y1,B=5000)
} # }