Impact points selection of functional predictor and regression using local maxima distance correlation (LMDC)
Source: R/LMDC.select.R
LMDC.select selects the impact points of a functional predictor using the
local maxima of the distance correlation (LMDC) with a given scalar response.
LMDC.regre fits a multivariate regression method to a scalar response, using
the selected impact points as covariates.
Usage
LMDC.select(
y,
covar,
data,
tol = 0.06,
pvalue = 0.05,
plot = FALSE,
local.dc = TRUE,
smo = FALSE,
verbose = FALSE
)
LMDC.regre(
y,
covar,
data,
newdata,
pvalue = 0.05,
method = "lm",
par.method = NULL,
plot = FALSE,
verbose = FALSE
)
Arguments
- y
name of the response variable.
- covar
vector with the names of the covariables (or points of impact), of length p.
- data
data frame with n rows and at least p + 1 columns, containing the scalar response and the p potential covariables (or points of impact) in the model.
- tol
Tolerance value for distance correlation and impact point.
- pvalue
p-value threshold of the bias-corrected distance correlation t-test.
- plot
logical. If TRUE, plots the distance correlation curve for each covariate in the multivariate case and at each discretization point (argvals) in the functional case.
- local.dc
logical. If TRUE, computes the local distance correlation.
- smo
logical. If TRUE, the curve of distance correlation computed in the impact points is smoothed using B-spline representation with a suitable number of basis elements.
- verbose
logical. If TRUE, prints the iterative and relevant steps of the procedure.
- newdata
An optional data frame in which to look for variables with which to predict.
- method
Name of the regression method to use; see Details. This argument is passed to do.call as the "what" argument.
- par.method
List of parameters used to call the method. This argument is passed to do.call as the "args" argument.
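The way method and par.method feed into do.call can be illustrated with a minimal base R sketch. The helper fit_by_name below is hypothetical and is not part of fda.usc; it only shows the general "what"/"args" pattern, under the assumption that the extra parameters are simply appended to the formula and data.

```r
# Hypothetical helper (NOT the fda.usc implementation): dispatch a regression
# function by name, appending user-supplied parameters to the call.
fit_by_name <- function(method, formula, data, par.method = NULL) {
  args <- c(list(formula = formula, data = data), par.method)
  do.call(what = method, args = args)
}

set.seed(1)
dat <- data.frame(y = rnorm(30), X1 = rnorm(30), X2 = rnorm(30))
w <- runif(30)  # illustrative observation weights

# "lm" plays the role of `method`, the list plays the role of `par.method`
fit <- fit_by_name("lm", y ~ X1 + X2, dat, par.method = list(weights = w))
class(fit)
```

Passing the method as a string keeps the calling code identical whichever regression function is chosen; only the name and the parameter list change.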
Value
LMDC.select
function returns a list of two elements:
cor
the value of distance correlation for each covariate.
maxLocal
index or locations of local maxima distance correlations.
LMDC.regre
function returns a list with the following elements:
model
object corresponding to the estimated method using the selected variables.
xvar
names of selected variables (impact points).
edf
effective degrees of freedom.
nvar
number of selected variables (impact points).
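To make the cor and maxLocal elements more concrete, the following base R sketch computes a distance correlation curve on simulated curves and picks its local maxima. It is an illustration only: dcor_simple and local_maxima are hypothetical helpers, the plain (not bias-corrected) sample statistic is used, no t-test is applied, and treating tol as a minimum distance correlation is an assumption rather than the package's documented rule.

```r
# Plain sample distance correlation between two numeric vectors
# (hypothetical helper; not the bias-corrected version used with the t-test).
dcor_simple <- function(x, y) {
  dcen <- function(D) D - outer(rowMeans(D), colMeans(D), "+") + mean(D)
  A <- dcen(as.matrix(dist(x)))  # double-centred distance matrix of x
  B <- dcen(as.matrix(dist(y)))  # double-centred distance matrix of y
  dcov2 <- max(mean(A * B), 0)   # squared distance covariance
  sqrt(dcov2 / sqrt(mean(A * A) * mean(B * B)))
}

# Indices where the curve is a local maximum and exceeds `tol`
# (assumption: `tol` acts as a minimum distance correlation).
local_maxima <- function(v, tol = 0.06) {
  n <- length(v)
  padded <- c(-Inf, v, -Inf)
  left  <- padded[1:n]
  mid   <- padded[2:(n + 1)]
  right <- padded[3:(n + 2)]
  which(mid > left & mid >= right & mid > tol)
}

set.seed(1)
n <- 80
npts <- 40
X <- t(replicate(n, cumsum(rnorm(npts))))  # n rough curves on npts grid points
y <- X[, 20] + rnorm(n, sd = 0.5)          # response driven by grid point 20

dc <- apply(X, 2, dcor_simple, y = y)      # analogue of the `cor` element
peaks <- local_maxima(dc, tol = 0.06)      # analogue of the `maxLocal` element
peaks
```

In this sketch the names of the candidate covariates would be built from peaks in the same way the package example builds them from maxLocal.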
Details
The method argument of LMDC.regre is a character string with the name of the regression method to call. Available options:
- "lm": Step-wise lm regression model (uses lm function, stats package). Recommended for linear models; test linearity using the flm.test function.
- "gam": Step-wise gam regression model (uses gam function, mgcv package). Recommended for non-linear models.
Models that use the indicated function of the required package:
- "svm": Support vector machine (svm function, e1071 package).
- "knn": k-nearest neighbor regression (knn.reg function, FNN package).
- "lars": Least Angle Regression, Lasso and Infinitesimal Forward Stagewise regression models (lars function, lars package).
- "glmnet": Lasso and Elastic-Net Regularized Generalized Linear Models (glmnet and cv.glmnet functions, glmnet package).
- "rpart": Recursive partitioning for regression (rpart function, rpart package).
- "flam": Fit the Fused Lasso Additive Model for a Sequence of Tuning Parameters (flam function, flam package).
- "novas": NOnparametric VAriable Selection (code available at https://www.math.univ-toulouse.fr/~ferraty/SOFTWARES/NOVAS/novas-routines.R).
- "cosso": Fit Regularized Nonparametric Regression Models Using COSSO Penalty (cosso function, cosso package).
- "npreg": Kernel regression estimate of a one-dimensional dependent variable on p-variate explanatory data (npreg function, np package).
- "mars": Multivariate adaptive regression splines (mars function, mda package).
- "nnet": Fit Neural Networks (nnet function, nnet package).
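As a rough illustration of what a step-wise "lm" fit driven by a p-value threshold could look like, here is a base R sketch of forward selection. The function stepwise_lm is hypothetical and the selection rule (keep a candidate only if its coefficient is significant when it is added) is an assumption; the actual procedure in LMDC.regre may differ.

```r
# Hypothetical forward selection (NOT the LMDC.regre implementation):
# candidates are tried in the given order and kept if significant.
stepwise_lm <- function(y, covar, data, pvalue = 0.05) {
  kept <- character(0)
  for (v in covar) {
    trial <- c(kept, v)
    fit <- lm(reformulate(trial, response = y), data = data)
    p <- summary(fit)$coefficients[v, "Pr(>|t|)"]
    if (p < pvalue) kept <- trial
  }
  rhs <- if (length(kept) > 0) kept else "1"  # intercept-only fallback
  model <- lm(reformulate(rhs, response = y), data = data)
  list(model = model, xvar = kept, nvar = length(kept))
}

set.seed(1)
dat <- data.frame(X1 = rnorm(100), X2 = rnorm(100), X3 = rnorm(100))
dat$y <- 3 * dat$X1 + rnorm(100, sd = 0.5)  # only X1 drives the response

out <- stepwise_lm("y", c("X1", "X2", "X3"), dat, pvalue = 0.05)
out$xvar
```

The returned list mirrors the model, xvar and nvar elements described in the Value section, which makes it easy to compare against the output of the real function.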
References
Ordonez, C., Oviedo de la Fuente, M., Roca-Pardinas, J., Rodriguez-Perez, J. R. (2018). Determining optimum wavelengths for leaf water content estimation from reflectance: A distance correlation approach. Chemometrics and Intelligent Laboratory Systems, 173, 41-50. doi:10.1016/j.chemolab.2017.12.001
Author
Manuel Oviedo de la Fuente manuel.oviedo@udc.es
Examples
if (FALSE) { # \dontrun{
library(fda.usc)  # provides tecator, fdata.deriv, fregre.*, flm.test, LMDC.*
data(tecator)
# Second derivative of the absorbance curves
absorp <- fdata.deriv(tecator$absorp.fdata, 2)
# Training sample (first 129 curves) and test sample (the rest)
ind <- 1:129
x <- absorp[ind, ]
y <- tecator$y$Fat[ind]
newx <- absorp[-ind, ]
newy <- tecator$y$Fat[-ind]

## Functional PC regression
res.pc <- fregre.pc(x, y, 1:6)
pred.pc <- predict(res.pc, newx)

## Functional regression with basis representation
res.basis <- fregre.basis.cv(x, y)
pred.basis <- predict(res.basis[[1]], newx)

## Functional nonparametric regression
res.np <- fregre.np.cv(x, y)
pred.np <- predict(res.np, newx)

## Functional generalized spectral additive model
dat <- data.frame("y" = y, x$data)
newdat <- data.frame("y" = newy, newx$data)
res.gam <- fregre.gsam(y ~ s(x), data = list("df" = dat, "x" = x))
pred.gam <- predict(res.gam, list("x" = newx))

## Impact point selection with LMDC
dc.raw <- LMDC.select("y", data = dat, tol = 0.05, pvalue = 0.05,
                      plot = FALSE, smo = TRUE, verbose = FALSE)
covar <- paste("X", dc.raw$maxLocal, sep = "")
# Preselected design/impact points
covar

## Linearity test to choose the regression method
ftest <- flm.test(dat[, -1], dat[, "y"], B = 500, verbose = FALSE,
                  plot.it = FALSE, type.basis = "pc", est.method = "pc",
                  p = 4, G = 50)
if (ftest$p.value > 0.05) {
  # Linear relationship, step-wise lm is recommended
  out <- LMDC.regre("y", covar, dat, newdat, pvalue = 0.05,
                    method = "lm", plot = FALSE, verbose = FALSE)
} else {
  # Non-linear relationship, step-wise gam is recommended
  out <- LMDC.regre("y", covar, dat, newdat, pvalue = 0.05,
                    method = "gam", plot = FALSE, verbose = FALSE)
}
# Final design/impact points
out$xvar

## Mean squared prediction error on the test sample
mean((newy - pred.pc)^2)
mean((newy - pred.basis)^2)
mean((newy - pred.np)^2)
mean((newy - pred.gam)^2)
mean((newy - out$pred)^2)
} # }