Computes a functional GLM model relating functional covariates \((X^1(t_1),\cdots,X^{q}(t_q))\) and non functional covariates \((Z^1,\cdots,Z^p)\) to a scalar response \(Y\).
Arguments
- data
List containing the variables in the model. The "df" element is a data.frame containing the response and scalar covariates (numeric and factor variables are allowed). Functional covariates of class fdata or fd are included as named components in the data list.
- y
Character string with the name of the scalar response variable.
- include
Vector with the names of the variables to use. By default "all", all variables are used.
- exclude
Vector with the names of the variables not to use. By default "none", no variable is deleted.
- family
A description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function or the result of a call to a family function. (See family for details of family functions.)
- weights
weights
- basis.x
Basis parameter options:
  - list (recommended): List of bases for the functional covariates, see the same argument in fregre.glm. By default, the function uses a basis of 3 PC to represent each functional covariate.
  - vector (by default): Vector with two parameters:
    - Type of basis. By default basis.x[1]="pc", a principal component basis is used for each functional covariate included in the model. Other options are "pls" and "bspline".
    - Maximum number of basis elements numbasis to be used. By default, basis.x[2]=3.
- numbasis.opt
Logical. If FALSE (the default), the function uses all basis elements for each functional covariate included in the model. Otherwise, the function selects the significant coefficients.
- dcor.min
Threshold for a variable to be entered into the model. X is discarded if the distance correlation \(R(X,e)< dcor.min\) (e is the residual of previous steps).
- alpha
Alpha value for testing the independence between covariate X and the residual e of previous steps. By default, 0.05.
- par.model
Model parameters.
- xydist
List with the inner distance matrices of each variable (all potential covariates and the response).
- trace
Interactive Tracing and Debugging of Call.
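The screening rule described for dcor.min can be illustrated with a small base R sketch. This is not the package's implementation: dcor_sketch, candidates and dcor_min are hypothetical names introduced here, the threshold value is arbitrary, and the function computes the usual sample distance correlation from double-centered distance matrices.

```r
# Sample distance correlation between x and y (base R only; illustrative sketch).
dcor_sketch <- function(x, y) {
  # Double-center a pairwise distance matrix: subtract row and column means, add grand mean.
  center <- function(d) {
    d <- as.matrix(d)
    d - rowMeans(d)[row(d)] - colMeans(d)[col(d)] + mean(d)
  }
  A <- center(dist(x))
  B <- center(dist(y))
  dcov2 <- max(mean(A * B), 0)          # squared distance covariance (guard rounding error)
  dvar  <- sqrt(mean(A * A) * mean(B * B))
  if (dvar == 0) return(0)              # constant input: no dependence measurable
  sqrt(dcov2 / dvar)
}

set.seed(1)
e <- rnorm(50)                          # stand-in for the residuals e of the previous step
candidates <- list(related = e^2, unrelated = rnorm(50))
dcor_min <- 0.1                         # hypothetical value playing the role of dcor.min

r <- sapply(candidates, dcor_sketch, y = e)
keep <- names(r)[r >= dcor_min]         # covariates with R(X, e) < dcor_min are discarded
print(r)
print(keep)
```

In small samples the distance correlation of independent variables is noticeably above zero, which is why a threshold is typically combined with an independence test such as the one controlled by alpha.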
Value
Returns an object corresponding to the estimated additive model using the selected variables (same output as the fregre.glm function) and the following elements:
- gof, the goodness of fit for each step of the VS algorithm.
- i.predictor, vector with 1 if the variable is selected, 0 otherwise.
- ipredictor, vector with the names of the selected variables (in order of selection).
- dcor, the value of the distance correlation between each potential covariate and the residual of the model in each step.
Details
This function is an extension of the functional generalized spectral additive
regression models (fregre.glm), where \(E[Y|X,Z]\) is related to the
linear predictor \(\eta\) via a link function \(g(\cdot)\):
$$E[Y|X,Z]=g^{-1}(\eta),\qquad \eta=\alpha+\sum_{j=1}^{p}\beta_{j}Z^{j}+\sum_{k=1}^{q}\frac{1}{\sqrt{T_k}}\int_{T_k}{X^{k}(t)\beta_{k}(t)dt}$$
where \(Z=\left[ Z^1,\cdots,Z^p \right]\) are the non functional covariates and \(X(t)=\left[ X^{1}(t_1),\cdots,X^{q}(t_q) \right]\) are the functional ones.
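As a sketch of why a basis representation (the basis.x argument) makes each functional term estimable, assume that \(X^{k}(t)\) (centered, if the basis requires it) and \(\beta_{k}(t)\) are expanded in the same orthonormal basis \(\{\phi_{l}^{k}\}\) on \(T_k\), truncated at \(L_k\) elements. The symbols \(\phi_{l}^{k}\), \(c_{l}^{k}\), \(b_{l}^{k}\) and \(L_k\) are notation introduced here for illustration and do not appear in the package documentation.

$$X^{k}(t)\approx\sum_{l=1}^{L_k}c_{l}^{k}\phi_{l}^{k}(t),\qquad \beta_{k}(t)=\sum_{l=1}^{L_k}b_{l}^{k}\phi_{l}^{k}(t)\quad\Longrightarrow\quad\int_{T_k}X^{k}(t)\beta_{k}(t)dt\approx\sum_{l=1}^{L_k}c_{l}^{k}b_{l}^{k}$$

Under this assumption each functional covariate contributes \(L_k\) scalar scores \(c_{l}^{k}\) to the linear predictor, so the model can be fitted like a GLM on scalar covariates. For a non orthonormal basis the sum involves the matrix of inner products of the basis functions instead.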
Note
If the formula only contains non functional explanatory variables (multivariate covariates),
the function computes a standard glm
procedure.
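The scalar-only case mentioned in the Note, and the three ways of specifying family described in the Arguments, can be tried with stats::glm alone. The following is a minimal sketch on simulated data; the data frame and the names z1, z2 and y are hypothetical.

```r
# Simulated scalar covariates and a binary response (illustrative data only).
set.seed(42)
n <- 100
df <- data.frame(z1 = rnorm(n),
                 z2 = factor(sample(c("a", "b"), n, replace = TRUE)))
df$y <- rbinom(n, 1, plogis(0.5 + df$z1))

# family given as a character string, a family function, or a call to a family function
fit_chr  <- glm(y ~ z1 + z2, data = df, family = "binomial")
fit_fun  <- glm(y ~ z1 + z2, data = df, family = binomial)
fit_call <- glm(y ~ z1 + z2, data = df, family = binomial(link = "logit"))

# The three specifications describe the same model, so the coefficients agree.
stopifnot(isTRUE(all.equal(coef(fit_chr), coef(fit_fun))),
          isTRUE(all.equal(coef(fit_chr), coef(fit_call))))
summary(fit_call)
```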
References
Febrero-Bande, M., González-Manteiga, W. and Oviedo de la Fuente, M. (2018). Variable selection in functional additive regression models. Computational Statistics, 1-19. DOI: 10.1007/s00180-018-0844-5
See also
See also: predict.fregre.glm and summary.glm.
Alternative methods: fregre.glm and fregre.gsam.vs.
Author
Manuel Febrero-Bande, Manuel Oviedo-de la Fuente manuel.oviedo@udc.es
Examples
if (FALSE) { # \dontrun{
library(fda.usc)
data(tecator)
x <- tecator$absorp.fdata
x1 <- fdata.deriv(x)
x2 <- fdata.deriv(x, nderiv = 2)
y <- tecator$y$Fat
xcat0 <- cut(rnorm(length(y)), 4)
xcat1 <- cut(tecator$y$Protein, 4)
xcat2 <- cut(tecator$y$Water, 4)
ind <- 1:165
# xcat0 is included so that all three factors listed below are available in "df"
dat <- data.frame("Fat" = y, x1$data, xcat0, xcat1, xcat2)
ldat <- ldata("df" = dat[ind, ], "x" = x[ind, ], "x1" = x1[ind, ], "x2" = x2[ind, ])
# 3 functionals (x, x1, x2), 3 factors (xcat0, xcat1, xcat2)
# and 100 scalars (impact points of x1)
# Time consuming
res.glm0 <- fregre.glm.vs(data=ldat,y="Fat",numbasis.opt=TRUE) # All the covariates
summary(res.glm0)
res.glm0$ipredictors
res.glm0$i.predictor
res.glm1 <- fregre.glm.vs(data=ldat,y="Fat") # All the covariates
summary(res.glm1)
res.glm1$ipredictors
covar <- c("xcat0","xcat1","xcat2","x","x1","x2")
res.glm2 <- fregre.glm.vs(data=ldat, y="Fat", include=covar)
summary(res.glm2)
res.glm2$ipredictors
res.glm2$i.predictor
res.glm3 <- fregre.glm.vs(data=ldat,y="Fat",
                          basis.x=c("type.basis"="pc","numbasis"=2))
summary(res.glm3)
res.glm3$ipredictors
res.glm4 <- fregre.glm.vs(data=ldat,y="Fat",include=covar,
                          basis.x=c("type.basis"="pc","numbasis"=5),numbasis.opt=TRUE)
summary(res.glm4)
res.glm4$ipredictors
# One PC basis per functional covariate
lpc <- list("x"=create.pc.basis(ldat$x,1:4),
            "x1"=create.pc.basis(ldat$x1,1:3),
            "x2"=create.pc.basis(ldat$x2,1:4))
res.glm5 <- fregre.glm.vs(data=ldat,y="Fat",basis.x=lpc)
summary(res.glm5)
res.glm5 <- fregre.glm.vs(data=ldat,y="Fat",basis.x=lpc,numbasis.opt=TRUE)
summary(res.glm5)
bsp <- create.fourier.basis(ldat$x$rangeval,7)
lbsp <- list("x"=bsp,"x1"=bsp,"x2"=bsp)
res.glm6 <- fregre.glm.vs(data=ldat,y="Fat",basis.x=lbsp)
summary(res.glm6)
# Prediction like fregre.glm()
newldat <- ldata("df"=dat[-ind,],"x"=x[-ind,],"x1"=x1[-ind,],
"x2"=x2[-ind,])
pred.glm1 <- predict(res.glm1,newldat)
pred.glm2 <- predict(res.glm2,newldat)
pred.glm3 <- predict(res.glm3,newldat)
pred.glm4 <- predict(res.glm4,newldat)
pred.glm5 <- predict(res.glm5,newldat)
pred.glm6 <- predict(res.glm6,newldat)
plot(dat[-ind,"Fat"],pred.glm1)
points(dat[-ind,"Fat"],pred.glm2,col=2)
points(dat[-ind,"Fat"],pred.glm3,col=3)
points(dat[-ind,"Fat"],pred.glm4,col=4)
points(dat[-ind,"Fat"],pred.glm5,col=5)
points(dat[-ind,"Fat"],pred.glm6,col=6)
pred2meas(newldat$df$Fat,pred.glm1)
pred2meas(newldat$df$Fat,pred.glm2)
pred2meas(newldat$df$Fat,pred.glm3)
pred2meas(newldat$df$Fat,pred.glm4)
pred2meas(newldat$df$Fat,pred.glm5)
pred2meas(newldat$df$Fat,pred.glm6)
} # }