
Computes functional linear regression between a functional explanatory variable \(X(t)\) and a scalar response \(Y\) using penalized Partial Least Squares (PLS): $$Y=\big<\tilde{X},\beta\big>+\epsilon=\int_{T}\tilde{X}(t)\beta(t)\,dt+\epsilon$$ where \( \big< \cdot , \cdot \big>\) denotes the inner product on \(L_2\) and \(\epsilon\) are random errors with mean zero, finite variance \(\sigma^2\) and \(E[\tilde{X}(t)\epsilon]=0\).
Here \(\left\{\nu_k\right\}_{k=1}^{\infty}\) is the orthonormal PLS basis used to represent the functional data as \(X_i(t)=\sum_{k=1}^{\infty}\gamma_{ik}\nu_k(t)\).

Usage

fregre.pls(fdataobj, y = NULL, l = NULL, lambda = 0, P = c(0, 0, 1), ...)

Arguments

fdataobj

fdata class object.

y

Scalar response with length n.

l

Index of components to include in the model.

lambda

Amount of penalization. Default value is 0, i.e. no penalization is used.

P

If P is a vector, its entries are the coefficients that define the penalty matrix. The default P=c(0,0,1) penalizes the second derivative (curvature, or acceleration). If P is a matrix, it is used directly as the penalty matrix.

...

Further arguments passed to or from other methods.
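
To illustrate how a coefficient vector such as P=c(0,0,1) can define a penalty matrix, here is a minimal base R sketch. It assumes an equally spaced grid and uses finite differences. The helper diff_penalty is hypothetical and is not the implementation used by P.penalty in fda.usc; it only shows the idea that the j-th entry of P weights the derivative of order j-1.

```r
# Hypothetical helper (not part of fda.usc): builds a penalty matrix from
# a coefficient vector P, assuming an equally spaced grid of m points.
# P[j] weights the squared differences of order (j - 1).
diff_penalty <- function(m, P = c(0, 0, 1)) {
  pen <- matrix(0, m, m)
  for (j in seq_along(P)) {
    if (P[j] == 0) next
    ord <- j - 1
    # Order 0 penalizes the values themselves; higher orders use differences.
    D <- if (ord == 0) diag(m) else diff(diag(m), differences = ord)
    pen <- pen + P[j] * crossprod(D)
  }
  pen
}

pen2 <- diff_penalty(6, P = c(0, 0, 1))
straight_line <- 1:6
# A straight line has zero second differences, so its penalty is zero.
drop(t(straight_line) %*% pen2 %*% straight_line)
```

With P=c(1) the same helper returns the identity matrix, which corresponds to a ridge-type penalty on the values themselves.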

Value

Return:

  • call: The matched call of fregre.pls function.

  • beta.est: Estimated beta coefficient, an object of class fdata.

  • coefficients: A named vector of coefficients.

  • fitted.values: Estimated scalar response.

  • residuals: y minus fitted values.

  • H: Hat matrix.

  • df.residual: The residual degrees of freedom.

  • r2: Coefficient of determination.

  • GCV: GCV criterion.

  • sr2: Residual variance.

  • l: Index of components to include in the model.

  • lambda: Amount of shrinkage.

  • fdata.comp: Object fitted by the fdata2pls function.

  • lm: Object fitted by the lm function.

  • fdataobj: Functional explanatory data.

  • y: Scalar response.

Details

The functional PLS (FPLS) algorithm maximizes the covariance between \(X(t)\) and the scalar response \(Y\) via the partial least squares (PLS) components. The functional penalized PLS components are calculated in fdata2pls by an alternative formulation of the NIPALS algorithm proposed by Kraemer and Sugiyama (2011).
Let \(\left\{\tilde{\nu}_k\right\}_{k=1}^{\infty}\) be the functional PLS components, with \(\tilde{X}_i(t)=\sum_{k=1}^{\infty}\tilde{\gamma}_{ik}\tilde{\nu}_k\) and \(\beta(t)=\sum_{k=1}^{\infty}\tilde{\beta}_k\tilde{\nu}_k\). The functional linear model is estimated by: $$\hat{y}=\big< X,\hat{\beta} \big> \approx \sum_{k=1}^{k_n}\tilde{\gamma}_{k}\tilde{\beta}_k $$
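
The truncated sum follows from expanding both \(\tilde{X}_i\) and \(\beta\) in the same basis. The short derivation below is a sketch that assumes the components are orthonormal, \(\big<\tilde{\nu}_j,\tilde{\nu}_k\big>=\delta_{jk}\), as stated for the PLS basis in the description; the subscript \(i\) indexes the observation.

```latex
\begin{aligned}
\big<\tilde{X}_i,\beta\big>
 &= \Big<\sum_{j=1}^{\infty}\tilde{\gamma}_{ij}\tilde{\nu}_j,\ \sum_{k=1}^{\infty}\tilde{\beta}_k\tilde{\nu}_k\Big> \\
 &= \sum_{j=1}^{\infty}\sum_{k=1}^{\infty}\tilde{\gamma}_{ij}\tilde{\beta}_k\big<\tilde{\nu}_j,\tilde{\nu}_k\big> \\
 &= \sum_{k=1}^{\infty}\tilde{\gamma}_{ik}\tilde{\beta}_k
 \approx \sum_{k=1}^{k_n}\tilde{\gamma}_{ik}\tilde{\beta}_k .
\end{aligned}
```

The approximation in the last step keeps only the first \(k_n\) components.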
The response can be fitted by:

  • No penalization, \(\lambda=0\), $$\hat{y}=\nu_k^{\top}(\nu_k^{\top}\nu_k)^{-1}\nu_k^{\top}y$$

  • Penalized regression, \(\lambda>0\) and \(P\neq0\). For example, \(P=c(0,0,1)\) penalizes the second derivative (curvature) by P=P.penalty(fdataobj["argvals"],P), $$\hat{y}=\nu_k^{\top}(\nu_k^{\top}\nu_k+\lambda \nu_k^{\top} \textbf{P}\nu_k)^{-1}\nu_k^{\top}y$$
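
The two cases above share the same structure: a least-squares fit on the component scores, with an extra term added to the cross-product matrix when \(\lambda>0\). The base R sketch below illustrates this on simulated data. It is not the fda.usc implementation: the helper penalized_fit, the score matrix Z and the penalty Q (standing in for \(\nu_k^{\top}\textbf{P}\nu_k\)) are assumptions for illustration, and the data are taken as centered, so no intercept is fitted.

```r
# Hypothetical helper (not from fda.usc): fitted values from a penalized
# least-squares fit on a score matrix Z (n x k) with a k x k penalty Q.
penalized_fit <- function(Z, y, lambda = 0, Q = diag(ncol(Z))) {
  # Hat matrix H = Z (Z'Z + lambda * Q)^{-1} Z'
  H <- Z %*% solve(crossprod(Z) + lambda * Q, t(Z))
  list(fitted = drop(H %*% y), H = H, df = sum(diag(H)))
}

set.seed(1)
n <- 50
Z <- matrix(rnorm(n * 3), n, 3)           # stand-in for component scores
y <- drop(Z %*% c(2, -1, 0.5)) + rnorm(n, sd = 0.1)

fit0 <- penalized_fit(Z, y)               # lambda = 0: ordinary least squares
fit1 <- penalized_fit(Z, y, lambda = 10)  # lambda > 0: shrunken fit

# Penalization lowers the effective degrees of freedom, trace(H).
c(unpenalized = fit0$df, penalized = fit1$df)
```

With lambda = 0 the fitted values coincide with those of lm(y ~ Z - 1); increasing lambda trades a larger residual sum of squares for a smoother, lower-variance fit.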

References

Preda C. and Saporta G. PLS regression on a stochastic process. Comput. Statist. Data Anal. 48 (2005): 149-158.

N. Kraemer, A.-L. Boulsteix, and G. Tutz (2008). Penalized Partial Least Squares with Applications to B-Spline Transformations and Functional Data. Chemometrics and Intelligent Laboratory Systems, 94, 60 - 69. doi:10.1016/j.chemolab.2008.06.009

Martens, H., Naes, T. (1989) Multivariate calibration. Chichester: Wiley.

Kraemer, N., Sugiyama M. (2011). The Degrees of Freedom of Partial Least Squares Regression. Journal of the American Statistical Association. Volume 106, 697-705.

Febrero-Bande, M., Oviedo de la Fuente, M. (2012). Statistical Computing in Functional Data Analysis: The R Package fda.usc. Journal of Statistical Software, 51(4), 1-28. https://www.jstatsoft.org/v51/i04/

See also

See also: P.penalty and fregre.pls.cv.
Alternative method: fregre.pc.

Author

Manuel Febrero-Bande, Manuel Oviedo de la Fuente manuel.oviedo@udc.es

Examples

if (FALSE) { # \dontrun{
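library(fda.usc)
data(tecator)
x <- tecator$absorp.fdata
y <- tecator$y$Fat
# Unpenalized fit on the first four PLS components
res <- fregre.pls(x, y, l = 1:4)
summary(res)
# Penalized fits: P = c(1) is a zero-order penalty, P = c(0, 0, 1) penalizes curvature
res1 <- fregre.pls(x, y, l = 1:4, lambda = 100, P = c(1))
res4 <- fregre.pls(x, y, l = 1:4, lambda = 1, P = c(0, 0, 1))
summary(res4)
# Compare the estimated beta curves
plot(res$beta.est)
lines(res1$beta.est, col = 4)
lines(res4$beta.est, col = 2)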
} # }