Chapter 2 Functional Regression Model
Regression models are techniques for modeling and analyzing the relationship between a dependent variable and one or more independent variables. When one of the variables has a functional nature, we have functional regression models.
This section is devoted to the functional regression models in which the response variable is scalar and at least one covariate is functional.
For illustration, we will use the Tecator dataset to predict the fat content from the following explanatory variables: the absorbance curves \(X(t)\) as functional data, or one of their first two derivatives (\(X.d1,X.d2\)), and/or the water content as a real variable.
library(fda.usc.devel)
data(tecator)
absorp<-tecator$absorp
ind<-sample(215,129) #ind = 1:129
tt = absorp[["argvals"]]
y = tecator[["y"]]$Fat[ind]
X = absorp[ind, ]
X.d1 = fdata.deriv(X, nbasis = 19, nderiv = 1)
X.d2 = fdata.deriv(X, nbasis = 19, nderiv = 2)
par(mfrow=c(2,2))
plot(X)
plot(X.d1)
plot(X.d2)
boxplot(y)
In the following sections, the regression methods implemented in the fda.usc package are presented one by one and illustrated with examples that estimate the fat content of the Tecator dataset.
2.1 Functional linear model (FLM) with basis representation
Suppose that \(\mathcal{X} \in \mathcal{L}_{2}(T)\) and \(y \in \mathbb{R}\). Assume also that \(\mathbb{E}[\mathcal{X}(t)]=0, \forall t \in T\) and \(\mathbb{E}[y]=0\).
The FLM states that \[y= \left\langle \mathcal{X},\beta \right\rangle +\varepsilon=\int_{T}\mathcal{X}(t)\beta(t)dt+\varepsilon\] where \(\beta \in \mathcal{L}_{2}(T)\) and \(\varepsilon\) is the error term.
One way of estimating \(\beta\) is to represent the parameter (and \(\mathcal{X}\)) in an \(\mathcal{L}_2\)-basis in the following way:
\[\beta(t)=\sum_k \beta_k \theta_k(t), \quad \mathcal{X}(t)=\sum_k c_k \psi_k(t)\]
The fregre.basis() function uses a fixed basis: B-spline, Fourier, etc. (Ramsay and Silverman 2005b; Cardot, Ferraty, and Sarda 1999).
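Once \(\beta\) is expanded in a basis, the integral \(\int_T \mathcal{X}_i(t)\beta(t)dt\) becomes \(\sum_k \beta_k \int_T \mathcal{X}_i(t)\theta_k(t)dt\), so the FLM reduces to an ordinary multiple linear regression on the integrals of each curve against the basis functions. The following base-R sketch illustrates this reduction on simulated curves. It is not the fregre.basis() implementation; all object names, the three-element Fourier-type basis and the Riemann-sum integration are assumptions made for the illustration.

```r
# Minimal base-R sketch (simulated data, hypothetical names): with a basis
# expansion of beta, the functional linear model becomes an ordinary lm().
set.seed(1)
n  <- 100                                  # number of curves
tt <- seq(0, 1, length.out = 101)          # common evaluation grid
dt <- tt[2] - tt[1]                        # grid step used for Riemann sums

# Small Fourier-type basis for beta: theta_1 = 1, theta_2 = sin, theta_3 = cos
theta <- cbind(1, sin(2 * pi * tt), cos(2 * pi * tt))

# Simulated curves X_i(t): random combinations of the basis plus noise
X <- matrix(rnorm(n * 3), n, 3) %*% t(theta) +
  matrix(rnorm(n * length(tt), sd = 0.1), n)

# True coefficient function and scalar response y_i = int X_i(t) beta(t) dt + e_i
b.true    <- c(0.5, 2, -1)
beta.true <- drop(theta %*% b.true)
y <- drop(X %*% beta.true) * dt + rnorm(n, sd = 0.05)

# Design matrix Z[i, k] = int X_i(t) theta_k(t) dt, approximated by a Riemann sum
Z   <- X %*% theta * dt
fit <- lm(y ~ Z)

# Estimated coefficient function on the grid
beta.hat <- drop(theta %*% coef(fit)[-1])
```

The fitted coefficients in coef(fit)[-1] play the role of the \(\beta_k\), and beta.hat can be compared with beta.true, for example with plot(tt, beta.true, type = "l"); lines(tt, beta.hat, lty = 2).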
The next code illustrates how to estimate the fat content using a sample of absorbance curves.
rangett <- X$rangeval
basis1 = create.bspline.basis(rangeval = rangett, nbasis = 17)
basis2 = create.bspline.basis(rangeval = rangett, nbasis = 7)
res.basis0 = fregre.basis(X, y, basis.x = basis1, basis.b = basis2)
res.basis1 = fregre.basis(X.d1, y, basis.x = basis1, basis.b = basis2)
res.basis2 = fregre.basis(X.d2, y, basis.x = basis1, basis.b = basis2)
res.basis0$r2;res.basis1$r2;res.basis2$r2
## [1] 0.9394412
## [1] 0.937962
## [1] 0.9529404
## *** Summary Functional Data Regression with representation in Basis ***
##
## Call:
## fregre.basis(fdataobj = X.d2, y = y, basis.x = basis1, basis.b = basis2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.1083 -1.9775 -0.2428 1.8372 6.1884
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.782e+01 2.557e-01 69.678 < 2e-16 ***
## Spectrometriccurves.bspl4.1 -1.302e+04 3.586e+03 -3.631 0.000415 ***
## Spectrometriccurves.bspl4.2 9.425e+03 2.785e+03 3.384 0.000964 ***
## Spectrometriccurves.bspl4.3 -1.850e+03 1.456e+03 -1.270 0.206399
## Spectrometriccurves.bspl4.4 1.101e+03 1.112e+03 0.990 0.323953
## Spectrometriccurves.bspl4.5 -2.106e+03 1.147e+03 -1.835 0.068895 .
## Spectrometriccurves.bspl4.6 7.573e+03 1.859e+03 4.074 8.31e-05 ***
## Spectrometriccurves.bspl4.7 -7.804e+03 1.411e+03 -5.532 1.86e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.905 on 121 degrees of freedom
## Multiple R-squared: 0.9529, Adjusted R-squared: 0.9502
## F-statistic: 350 on 7 and 121 DF, p-value: < 2.2e-16
##
## -Names of possible atypical curves: No atypical curves
## -Names of possible influence curves: 140
The choice of an appropriate basis (and of the number of basis elements) now becomes a crucial step:
## *** Summary Functional Data Regression with representation in Basis ***
##
## Call:
## fregre.basis(fdataobj = X, y = y)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.0856 -2.0114 -0.1727 2.2774 5.9429
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 17.8194 0.2651 67.216 <2e-16 ***
## Spectrometriccurves.bspl4.1 -175.0780 93.4300 -1.874 0.0634 .
## Spectrometriccurves.bspl4.2 206.9709 112.4617 1.840 0.0682 .
## Spectrometriccurves.bspl4.3 -105.0643 73.6345 -1.427 0.1563
## Spectrometriccurves.bspl4.4 8.3363 34.1866 0.244 0.8078
## Spectrometriccurves.bspl4.5 24.4211 19.6489 1.243 0.2164
## Spectrometriccurves.bspl4.6 -33.0608 17.4476 -1.895 0.0606 .
## Spectrometriccurves.bspl4.7 56.9342 29.0160 1.962 0.0521 .
## Spectrometriccurves.bspl4.8 -116.5058 67.9025 -1.716 0.0888 .
## Spectrometriccurves.bspl4.9 163.7199 111.2343 1.472 0.1437
## Spectrometriccurves.bspl4.10 -128.3787 96.4300 -1.331 0.1857
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.011 on 118 degrees of freedom
## Multiple R-squared: 0.9507, Adjusted R-squared: 0.9465
## F-statistic: 227.5 on 10 and 118 DF, p-value: < 2.2e-16
##
## -Names of possible atypical curves: 43
## -Names of possible influence curves: 140 43 6
- Functional Principal Components (FPC) (Cardot, Ferraty, and Sarda 1999): fregre.pc()
x<-X
basis.pc0 = create.pc.basis(X,1:3)   # basis of the first three principal components
basis.pc = basis.pc0                 # name used in the call below
res.pc1 = fregre.pc(X, y, basis.x = basis.pc)
summary(res.pc1)
## *** Summary Functional Data Regression with Principal Components ***
##
## Call:
## fregre.pc(fdataobj = X, y = y, basis.x = basis.pc)
##
## Residuals:
## Min 1Q Median 3Q Max
## -24.983 -3.756 0.563 4.855 13.991
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 17.81938 0.68448 26.034 < 2e-16 ***
## PC1 -0.98512 0.09248 -10.652 < 2e-16 ***
## PC2 3.86630 1.04836 3.688 0.000336 ***
## PC3 -21.05104 1.78577 -11.788 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.774 on 125 degrees of freedom
## Multiple R-squared: 0.6517, Adjusted R-squared: 0.6434
## F-statistic: 77.98 on 3 and 125 DF, p-value: < 2.2e-16
##
##
## -With 3 Principal Components is explained 99.48 %
## of the variability of explicative variables.
##
## -Variability for each principal components -PC- (%):
## PC1 PC2 PC3
## 98.46 0.75 0.26
## -Names of possible atypical curves: 14 13 188
## -Names of possible influence curves: 99 140 44 139 204
## Length Class Mode
## fregre.pc 19 fregre.fd list
## pc.opt 7 -none- numeric
## lambda.opt 1 -none- numeric
## PC.order 8 -none- numeric
## MSC.order 8 -none- numeric
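The principal component approach above regresses the response on the scores of the curves on their leading principal components and then maps the fitted coefficients back to a coefficient function. The following base-R sketch shows the idea with prcomp() on simulated, discretized curves. It is only an illustration under stated assumptions (simulated data, three components, Riemann-sum integration, hypothetical object names) and not the fregre.pc() implementation.

```r
# Minimal base-R sketch of principal component regression for curves.
set.seed(5)
n  <- 120
tt <- seq(0, 1, length.out = 101)          # common evaluation grid
dt <- tt[2] - tt[1]

# Simulated curves: three smooth components with decreasing variance plus noise
comp <- cbind(sin(2 * pi * tt), cos(2 * pi * tt), sin(4 * pi * tt))
X <- matrix(rnorm(n * 3, sd = rep(c(3, 2, 1), each = n)), n, 3) %*% t(comp) +
  matrix(rnorm(n * length(tt), sd = 0.05), n)

# Scalar response generated by a functional linear model
beta.true <- 2 * sin(2 * pi * tt) - cos(2 * pi * tt)
y <- 1 + drop(X %*% beta.true) * dt + rnorm(n, sd = 0.1)

# Principal components of the centred, discretized curves
pc     <- prcomp(X, center = TRUE, scale. = FALSE)
scores <- pc$x[, 1:3]                      # scores on the first three PCs

# Ordinary regression of y on the scores
fit <- lm(y ~ scores)

# Map the score coefficients back to beta(t); dividing by dt converts the
# vector inner product used by prcomp() into an integral over the grid
beta.hat <- drop(pc$rotation[, 1:3] %*% coef(fit)[-1]) / dt
```

Increasing the number of retained components lowers the approximation error of \(\beta\) but raises the variance of the estimate, which is why the selection of components matters in practice.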
2.2 FLM with functional and non functional covariates
\[E(y)=\alpha+\mathbf{Z}\beta+\sum_{q=1}^Q \left\langle \mathcal{X}^{q}(t),\beta_{q}(t)\right\rangle \]
where \(\left\{\mathcal{X}^{q}(t)\right\}_{q=1}^Q\) are the functional covariates and \(\mathbf{Z}=\left\{{Z_j}\right\}_{j=1}^J\) the non-functional covariates.
dataf = as.data.frame(tecator[["y"]][ind,]) # Fat, Protein, Water
basis.pc2 = create.pc.basis(X.d2,1:4)
basis.x = list(X = basis.pc0, X.d2 =basis.pc2)
f = Fat ~ X+X.d2
ldata = list(df = dataf, X=X,X.d2=X.d2)
res.lm1 = fregre.lm(f, ldata, basis.x = basis.x)
f = Fat ~ Water+X.d2
res.lm2 = fregre.lm(f, ldata, basis.x = basis.x)
##
## Call:
## lm(formula = pf, data = XX, x = TRUE)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.5068 -1.8113 -0.0981 1.8422 8.0558
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.782e+01 2.509e-01 71.033 < 2e-16 ***
## X.PC1 -1.772e-01 9.719e-02 -1.823 0.070734 .
## X.PC2 -7.644e+00 2.481e+00 -3.080 0.002559 **
## X.PC3 -2.490e+01 7.211e+00 -3.452 0.000766 ***
## X.d2.PC1 2.743e+03 4.997e+02 5.490 2.25e-07 ***
## X.d2.PC2 7.573e+03 2.684e+03 2.822 0.005586 **
## X.d2.PC3 -3.463e+03 1.029e+03 -3.364 0.001030 **
## X.d2.PC4 -1.177e+03 3.133e+03 -0.376 0.707841
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.849 on 121 degrees of freedom
## Multiple R-squared: 0.9547, Adjusted R-squared: 0.9521
## F-statistic: 364.5 on 7 and 121 DF, p-value: < 2.2e-16
2.3 Other procedures
- Partial Least Squares (FPLS): fregre.pls() (Preda and Saporta 2005).
- Penalized versions and parameter selection: fregre.pc.cv, fregre.basis.cv, fregre.np.cv (Febrero-Bande and Oviedo de la Fuente 2012).
- F-test for the FLM with scalar response: flm.Ftest, F-test (García-Portugués, González-Manteiga, and Febrero-Bande 2014).
- Goodness-of-fit test for the FLM with scalar response: flm.test (García-Portugués, González-Manteiga, and Febrero-Bande 2014).
- Measures of influence in FLM with scalar response: influence.fdata (Febrero-Bande, Galeano, and González-Manteiga 2010).
- Beta parameter estimation by wild or smoothed bootstrap procedure: fregre.bootstrap.
- FLM with a functional response: fregre.basis.fr (Chiou et al. 2004).
2.4 Non Linear Model (Frédéric Ferraty and Vieu 2006)
Suppose \((\mathcal{X},Y)\) is a pair of random variables with \(Y\in \mathbb{R}\) and \(\mathcal{X}\in \mathbb{E}\), where \(\mathbb{E}\) is a semi-metric space. To predict the response \(Y\) from \(\mathcal{X}\), the quantity to estimate is the regression function
\[m(\mathcal{X})=\mathbb{E}(Y|X=\mathcal{X})\]
and the Nadaraya-Watson (NW) estimator is given by:
\[\hat{m}(\mathcal{X})=\frac{\sum_{i=1}^n Y_i{ K(d(\mathcal{X},X_i)/h)}}{\sum_{i=1}^n {K(d(\mathcal{X},X_i)/h)}}\]
where \(K\) is an asymmetric kernel function, \(d\) is the semi-metric on \(\mathbb{E}\) and \(h\) is the bandwidth parameter.
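The estimator above is a weighted average of the observed responses, with weights that decrease as the distance between the new curve and each training curve grows. A minimal base-R sketch follows. It assumes an \(L_2\) distance on a common grid and a one-sided quadratic kernel supported on \([0,1]\). The function nw.predict and the simulated data are hypothetical and are not part of fda.usc.

```r
# Minimal base-R sketch of the Nadaraya-Watson estimator for curves.
# Assumptions (not from the text): L2 distance on a common grid and an
# asymmetric (one-sided) quadratic kernel supported on [0, 1].
nw.predict <- function(Xtrain, ytrain, Xnew, h, dt = 1) {
  K <- function(u) ifelse(u >= 0 & u <= 1, 1 - u^2, 0)   # asymmetric kernel
  apply(Xnew, 1, function(x0) {
    d <- sqrt(colSums((t(Xtrain) - x0)^2) * dt)          # d(x0, X_i) for all i
    w <- K(d / h)                                        # kernel weights
    # NA when no training curve lies within distance h of x0
    if (sum(w) == 0) NA_real_ else sum(w * ytrain) / sum(w)
  })
}

set.seed(2)
tt <- seq(0, 1, length.out = 51)
a  <- runif(80, 0, 2)                      # one random amplitude per curve
X  <- outer(a, sin(pi * tt))               # curves X_i(t) = a_i * sin(pi * t)
y  <- a^2 + rnorm(80, sd = 0.05)           # nonlinear regression function

# Fit on the first 60 curves, predict the remaining 20
pred <- nw.predict(X[1:60, ], y[1:60], X[61:80, ],
                   h = 0.2, dt = tt[2] - tt[1])
```

A smaller h gives a more local (less biased, more variable) estimate, while a larger h averages over more curves; the normalizing constant of the kernel cancels in the ratio, so it is omitted here.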
2.5 Semi Linear Model (Aneiros-Pérez and Vieu 2006)
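Consider \((\mathcal{X},\mathbf{Z},y)\) with \(y\in \mathbb{R}\) (response), \(\mathcal{X}\in \mathbb{E}\) (functional covariate) and \(\mathbf{Z} \in \mathbb{R}^p\) (multivariate covariates). The semi-functional partial linear model combines a linear term in \(\mathbf{Z}\) with a nonparametric term in \(\mathcal{X}\):
\[y = \mathbf{Z}\beta + m(\mathcal{X}) + \varepsilon\]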
Arguments for the fregre.np() and fregre.plm() functions:
- Ker: type of asymmetric kernel function; by default the asymmetric normal kernel (other options: cosine, Epanechnikov, quadratic, ...).
- metric: type of metric or semimetric.
- type.S: type of smoothing matrix \(\mathbf{S}\): S.NW, S.LLR, S.KNN.
tecator<-list("df"=tecator$y,"absorp.fdata"=tecator$absorp.fdata)
X=tecator$absorp.fdata
y<-tecator$df$Fat
np<-fregre.np(X, y, metric = semimetric.deriv, nderiv = 1,type.S = S.KNN)
The function fregre.np.cv is also implemented to estimate the smoothing parameter \(h\) by a validation criterion.
np<-fregre.np(X, y, metric = semimetric.deriv, nderiv = 1,type.S = S.KNN)
np.cv<-fregre.np.cv(X, y, metric = semimetric.deriv, nderiv = 1,type.S = S.KNN,h=c(3:9))
c(np$h.opt,np.cv$h.opt)
## [1] 12 3
## [1] 0.7438657 0.9343133
2.6 Generalized Linear Models (Müller and Stadtmüller 2005)
A natural extension of the linear model is the generalized functional linear model (GFLM), which allows various types of response. In the GLM framework it is generally assumed that the distribution of \(y_i|X_i\) belongs to the exponential family.
In the GFLM, the scalar response \(y\) (with a distribution in the exponential family) is modelled by functional \(\left\{\mathcal{X}^{q}(t)\right\}_{q=1}^Q\) and also non-functional \(\mathbf{Z}=\left\{{Z_j}\right\}_{j=1}^J\) covariates through:
\[E(y)=g^{-1}\left(\alpha+\mathbf{Z}\beta+\sum_{q=1}^Q \left\langle \mathcal{X}^{q}(t),\beta_{q}(t)\right\rangle\right) \] where \(g(\cdot)\) is the link function and \(g^{-1}(\cdot)\) its inverse.
Example of logistic regression
In logistic regression, the probability \(\pi_i\) of the occurrence of an event, \(Y_i = 1\), rather than the event \(Y_i = 0\), conditional on a functional covariate \(\mathcal{X}_i(t)\), is expressed as:
\[ \pi_i = \mathbb{P}[Y_i = 1|\{X_i(t): t \in T \}]=\frac{\exp\left\{\alpha+\int_{T}X_{i}(t)\beta(t)dt \right\}}{1+\exp\left\{\alpha+\int_{T}X_{i}(t)\beta(t)dt \right\}}, \quad i= 1,\ldots,n\]
so that \(Y_i=\pi_i+\epsilon_i\), where the \(\epsilon_i\) are independent errors with zero mean.
data(tecator)
names(tecator)[2]<-"df"
tecator$df$fat15<-ifelse(tecator$df$Fat<15,0,1)
tecator$absorp.d2=fdata.deriv(tecator$absorp.fdata,nderiv=2)
res.glm<-fregre.glm(fat15 ~ absorp.d2,data=tecator,family=binomial())
#summary(res.glm)
yfit<-ifelse(res.glm$fitted.values<.5,0,1)
table(tecator$df$fat15,yfit)
## yfit
## 0 1
## 0 111 1
## 1 1 102
2.7 Generalized Functional Additive Model
- Generalized Functional Spectral Additive Linear Model (FGSAM), (Müller and Yao 2012)
\[E(y)=g^{-1}\left(\alpha+\sum_{j=1}^J f_{j}\left(\mathbf{Z}^{j}\right)+\sum_{q=1}^Q s_q\left(\mathcal{X}_{i}^{q}(t)\right)\right)\]
where \({f}_j(\cdot)\) and \({s}_q(\cdot)\) are smooth functions.
res.gsam<-fregre.gsam(fat15~ s(absorp.d2),data=tecator,family=binomial())
yfit<-ifelse(res.gsam$fitted<.5,0,1)
table(tecator$df$fat15,yfit)
## yfit
## 0 1
## 0 112 0
## 1 0 103
- Generalized Functional Kernel Additive Linear Model (FGKAM), (Febrero-Bande and González-Manteiga 2013)
\[E(y)=g^{-1}\left(\alpha+\sum_{q=1}^Q\mathcal{K}\left(\mathcal{X}^{q}_i(t)\right)\right)\] where \(\mathcal{K}(\cdot)\) is the kernel estimator.
2.8 Functional GLS model
See Oviedo de la Fuente et al. (2018) for more details about the algorithms below:
A. Joint estimation (nlme package): minimize over \((\beta,\theta)\) the GLS criterion, i.e.,
\[\Psi(\beta,\theta)=\left(y-\left\langle X,\beta \right\rangle\right)^\prime\Sigma(\theta)^{-1}\left(y-\left\langle X,\beta \right\rangle\right)\]
B. Iterative estimation: in the multivariate case, Zivot and Wang (2007) show that estimating \(\beta\) by \(\hat{\beta}_{ML}\) is equivalent to an iterative scheme in which \(\hat{\beta}\) is recomputed at each iteration using the updated estimate of \(\Sigma\):
1. Begin with a preliminary estimate \(\hat{\theta}=\theta_0\). Compute \(\hat{W}=\Sigma(\theta_0)^{-1}\).
2. Estimate \({b}_\Sigma={(Z^\prime\hat{W}Z)^{-1}Z^\prime\hat{W}}y\).
3. Based on \(\hat{e}=({y-{Z}{b}_\Sigma})\), update \(\hat{\theta}=\rho({\hat{e}})\), where \(\rho\) depends on the dependence structure chosen, and recompute \(\hat{W}=\Sigma(\hat{\theta})^{-1}\).
4. Repeat steps 2 and 3 until convergence.
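The iterative scheme above can be sketched in base R for the simplest dependence structure, AR(1) errors. This is a minimal illustration and not the fregre.igls() implementation: the data are simulated, Z is an ordinary design matrix (in the functional case its columns would hold the basis scores of the curves), and \(\rho(\hat{e})\) is taken to be the lag-1 autocorrelation of the residuals, which is an assumption made for the example.

```r
# Minimal base-R sketch of iterative GLS with AR(1) errors (simulated data).
set.seed(3)
n <- 300
Z <- cbind(1, rnorm(n), rnorm(n))                    # design matrix
e <- as.numeric(arima.sim(list(ar = 0.5), n = n))    # AR(1) errors, phi = 0.5
y <- drop(Z %*% c(1, 2, -1)) + e

# Sigma(theta): AR(1) correlation matrix with parameter phi
ar1.cor <- function(phi, n) phi^abs(outer(1:n, 1:n, "-"))

phi <- 0                                    # step 1: preliminary theta_0
for (iter in 1:50) {
  W   <- solve(ar1.cor(phi, n))             # W = Sigma(theta)^{-1}
  b   <- solve(t(Z) %*% W %*% Z, t(Z) %*% W %*% y)   # step 2: GLS estimate
  res <- drop(y - Z %*% b)                  # step 3: residuals
  phi.new <- sum(res[-1] * res[-n]) / sum(res^2)     # lag-1 autocorrelation
  if (abs(phi.new - phi) < 1e-6) break      # step 4: stop at convergence
  phi <- phi.new
}
```

Starting from phi = 0 makes the first pass an ordinary least squares fit; each later pass reweights the observations with the current estimate of the error correlation.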
The generalized correlated cross-validation (GCCV) criterion is an extension of GCV to the context of correlated errors (Carmack, Spence, and Schucany 2012). It is defined as follows:
\[GCCV(K_x,K_\beta,\mathbf{b},\phi)=\frac{\sum_{i=1}^n \left(y_{i}-\hat{y}_{i,\mathbf{b}}\right)^2}{ \left({1-\frac{{tr}(\mathbf{G})}{n}}\right)^2} \]
where \({G}=2{H}\Sigma(\phi)-{H}\Sigma(\phi)H^\prime\) takes into account the effect of the dependence, the trace of \({G}\) is an estimation of the degrees of freedom consumed by the model and \({H}\) is the hat matrix.
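The GCCV formula above can be evaluated directly from the hat matrix and the error correlation matrix. The base-R sketch below is a hypothetical helper (the name gccv and the simulated data are assumptions, and it is not the GCCV.S() function of fda.usc). With \(\Sigma\) equal to the identity and a projection hat matrix, \(G\) reduces to \(H\), so the score has the same form as the GCV-type criterion \(\mathrm{RSS}/(1-\mathrm{tr}(H)/n)^2\).

```r
# Minimal base-R sketch of the GCCV score for a linear smoother y.hat = H y.
gccv <- function(y, H, Sigma) {
  n    <- length(y)
  yhat <- drop(H %*% y)                                  # fitted values
  G    <- 2 * H %*% Sigma - H %*% Sigma %*% t(H)         # dependence-adjusted matrix
  sum((y - yhat)^2) / (1 - sum(diag(G)) / n)^2           # RSS / (1 - tr(G)/n)^2
}

set.seed(4)
n <- 50
Z <- cbind(1, rnorm(n))
y <- drop(Z %*% c(1, 2)) + rnorm(n)
H <- Z %*% solve(crossprod(Z), t(Z))                     # OLS hat matrix

score.indep <- gccv(y, H, diag(n))                       # independent errors
score.ar1   <- gccv(y, H, 0.5^abs(outer(1:n, 1:n, "-"))) # AR(1), phi = 0.5
```

Only matrix products and a trace are needed, which matches the remark that the criterion avoids inverting \(\Sigma\).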
The important advantage of this criterion is that it is rather easy to compute because it avoids the need to compute the inverse of the matrix \(\Sigma\). Even so, the complexity of the GLS criterion depends on the structure of \(\Sigma\) and it could sometimes be hard either to minimize or computationally expensive.
2.8.1 Dependent data example
We use the fregre.gls() function, which has the same arguments as the fregre.lm() function plus two more: the correlation argument, with the same functionality as in gls(), and the criteria argument, which requires the GCCV.S() function to calculate the GCCV score proposed by Carmack, Spence, and Schucany (2012).
## Water
## Fat -0.9881002
## Protein
## Fat -0.8608965
##
## dcor t-test of independence
##
## data: D1 and D2
## T = 571.71, df = 22789, p-value < 2.2e-16
## sample estimates:
## Bias corrected dcor
## 0.9668619
##
## dcor t-test of independence
##
## data: D1 and D2
## T = 155.43, df = 22789, p-value < 2.2e-16
## sample estimates:
## Bias corrected dcor
## 0.7173448
x.d2<-fdata.deriv(tecator[["absorp.fdata"]],nderiv=2)
ldata=list("df"=tecator[["y"]],"x.d2"=x.d2)
res.gls=fregre.gls(Fat~x.d2, data=ldata, correlation=corAR1())
coef(res.gls[["modelStruct"]],F)
## corStruct.Phi
## 0.4942661
The previous model is restricted to the dependence structures provided by the gls() function of the nlme package. The function fregre.igls() is presented as an alternative because it allows any type of dependence structure designed by the user.
The code below shows a simple use of the iterative scheme (iGLS). In particular, we use an iGLS-AR(\(p=1\)) scheme for error estimation.
res.igls=fregre.igls(Fat~x.d2, data=ldata, correlation=list("cor.ARMA"=list()),control=list("p"=1))
coef(res.igls[["corStruct"]][[1]])
## ar1
## 0.488854
##
## Call:
## list("fregre.basis")
##
## Coefficients:
## (Intercept) x.d2.bspl4.1 x.d2.bspl4.2 x.d2.bspl4.3 x.d2.bspl4.4
## 18.12 -608.20 6203.09 -8252.76 6271.43
## x.d2.bspl4.5
## -7156.85
## $ar
##
## Call:
## arima(x = x, order = c(p, d, q), include.mean = FALSE, transform.pars = TRUE)
##
## Coefficients:
## ar1
## 0.4889
## s.e. 0.0600
##
## sigma^2 estimated as 8.076: log likelihood = -529.76, aic = 1063.53
Both examples estimate an AR(1) structure with \(\phi\approx 0.49\). When the errors are serially dependent in this way, estimation and prediction with these models can be expected to be more accurate than with the classical functional models, which assume independent errors.
2.9 Functional Response Model
Reference papers: Faraway (1997), Frédéric Ferraty, Van Keilegom, and Vieu (2012)
R example of the function fregre.basis.fr():
data(aemet)
log10precfdata<-aemet$logprec; tempfdata<-aemet$temp
res2<-fregre.basis.fr(tempfdata,log10precfdata)
i<-1
plot(log10precfdata[i],lty=1,main=paste0("Weather station, ",i))
lines(res2$fitted.values[i],lty=2,lwd=2,col=4)
Febrero–Bande, Manuel, et al. “Functional regression models with functional response: a new approach and a comparative study.” Computational Statistics (2024): 1-27.
2.10 Other Models:
- Functional Quantile Regression Model, see Kato et al. (2012), Cardot, Crambes, and Sarda (2005).
- Functional Single Index Model, see Frédéric Ferraty, Park, and Vieu (2011).
- Functional Projection Pursuit Regression Model, see Frédéric Ferraty et al. (2013).
- Functional Machine Learning methods (SVM, RPART, NNET, random Forest)
Among others.