
The LMDC.select function selects impact points of a functional predictor using local maxima of the distance correlation (LMDC) for a given scalar response.
The LMDC.regre function fits a multivariate regression method, using the selected impact points as covariates, for a scalar response.

Usage

LMDC.select(
  y,
  covar,
  data,
  tol = 0.06,
  pvalue = 0.05,
  plot = FALSE,
  local.dc = TRUE,
  smo = FALSE,
  verbose = FALSE
)

LMDC.regre(
  y,
  covar,
  data,
  newdata,
  pvalue = 0.05,
  method = "lm",
  par.method = NULL,
  plot = FALSE,
  verbose = FALSE
)

Arguments

y

name of the response variable.

covar

vector of length p with the names of the covariates (or points of impact).

data

data frame with n rows and at least p + 1 columns, containing the scalar response and the p potential covariates (or points of impact) in the model.

tol

Tolerance value for distance correlation and impact point.

pvalue

p-value of the bias-corrected distance correlation t-test.

plot

logical value. If TRUE, plots the distance correlation curve for each covariate in the multivariate case and at each discretization point (argvals) in the functional case.

local.dc

logical. If TRUE, computes the local distance correlation.

smo

logical. If TRUE, the curve of distance correlation computed in the impact points is smoothed using B-spline representation with a suitable number of basis elements.

verbose

logical. If TRUE, prints iterative and relevant steps of the procedure.

newdata

An optional data frame in which to look for variables with which to predict.

method

Name of the regression method used; see Details. This argument is passed to the do.call function as its "what" argument.

par.method

List of parameters used to call the method. This argument is passed to the do.call function as its "args" argument.
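
The way method and par.method feed into do.call can be illustrated with a minimal base R sketch. The data set, the formula and the way the argument list is assembled here are assumptions made for illustration; they are not taken from the LMDC.regre source.

```r
# Hypothetical illustration of the do.call(what, args) pattern;
# this is not the internal code of LMDC.regre.
set.seed(1)
dat <- data.frame(y = rnorm(20), X1 = rnorm(20), X2 = rnorm(20))

method <- "lm"                       # plays the role of the "what" argument
par.method <- list(model = FALSE)    # extra arguments forwarded to the method

# Build the full argument list: the model specification plus the user options
args <- c(list(formula = y ~ X1 + X2, data = dat), par.method)

# Equivalent to lm(formula = y ~ X1 + X2, data = dat, model = FALSE)
fit <- do.call(method, args)
coef(fit)
```

Because the method is given by name, switching to another fitting function only requires changing the string and, where needed, the list of extra parameters.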

Value

LMDC.select

function returns a list of two elements:

cor

the value of distance correlation for each covariate.

maxLocal

index or locations of local maxima distance correlations.
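
As a rough illustration of what cor and maxLocal contain, the sketch below computes a sample distance correlation for each column of a small simulated data set in base R and then lists the indices where that curve has a local maximum above a threshold. The helper names dcor_sketch and local_maxima are hypothetical, the plain (not bias-corrected) distance correlation is used, and treating tol as a simple lower threshold is an assumption; LMDC.select itself may differ on all of these points.

```r
# Sample distance correlation between two numeric vectors (plain V-statistic form).
dcor_sketch <- function(x, y) {
  center <- function(d) {
    # Double-centre a distance matrix: subtract row and column means, add grand mean
    sweep(sweep(d, 1, rowMeans(d)), 2, colMeans(d)) + mean(d)
  }
  A <- center(as.matrix(dist(x)))
  B <- center(as.matrix(dist(y)))
  dcov2 <- max(mean(A * B), 0)       # squared distance covariance
  denom <- sqrt(mean(A * A) * mean(B * B))
  if (denom == 0) return(0)
  sqrt(dcov2 / denom)
}

# Indices of interior or boundary local maxima of v that exceed tol.
local_maxima <- function(v, tol = 0) {
  n <- length(v)
  left  <- c(-Inf, v[-n])
  right <- c(v[-1], -Inf)
  which(v > left & v >= right & v > tol)
}

set.seed(123)
n <- 100; p <- 30
# Random-walk "curves": neighbouring columns are strongly correlated
X <- t(apply(matrix(rnorm(n * p), n, p), 1, cumsum))
y <- X[, 10] + rnorm(n, sd = 0.5)    # response driven by column 10 only

dc <- apply(X, 2, dcor_sketch, y = y)  # one value per discretization point
peaks <- local_maxima(dc, tol = 0.06)  # candidate impact points
peaks
```

In this sketch dc plays the role of cor and peaks the role of maxLocal.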

LMDC.regre

function returns a list of the following elements:

model

fitted object returned by the chosen method, estimated using the selected variables.

xvar

names of selected variables (impact points).

edf

effective degrees of freedom.

nvar

number of selected variables (impact points).

Details

The method argument is a character string naming the regression method to be called. Available options:

  • "lm": Step-wise lm regression model (uses lm function, stats package). Recommended for linear models; linearity can be tested using the flm.test function.

  • "gam": Step-wise gam regression model (uses gam function, mgcv package). Recommended for non-linear models.

Models that use the indicated function of the required package:

  • "svm": Support vector machine (svm function, e1071 package).

  • "knn": k-nearest neighbor regression (knn.reg function, FNN package).

  • "lars": Least Angle Regression using Lasso (lars function, lars package).

  • "glmnet": Lasso and Elastic-Net Regularized Generalized Linear Models (glmnet and cv.glmnet function, glmnet package).

  • "rpart": Recursive partitioning for regression (rpart function, rpart package).

  • "flam": Fit the Fused Lasso Additive Model for a Sequence of Tuning Parameters (flam function, flam package).

  • "novas": NOnparametric VAriable Selection (code available at
    https://www.math.univ-toulouse.fr/~ferraty/SOFTWARES/NOVAS/novas-routines.R).

  • "cosso": Fit Regularized Nonparametric Regression Models Using COSSO Penalty (cosso function, cosso package).

  • "npreg": kernel regression estimate of a one (1) dimensional dependent variable on p-variate explanatory data (npreg function, np package).

  • "mars": Multivariate adaptive regression splines (mars function, mda package).

  • "nnet": Fit Neural Networks (nnet function, nnet package).

  • "lars": Fits Least Angle Regression, Lasso and Infinitesimal Forward Stagewise regression models (lars function, lars package).
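
To make the idea of a step-wise lm fit on preselected covariates concrete, here is a minimal forward-selection sketch in base R. The function forward_lm is hypothetical, and the rule "add the covariate with the smallest p-value while it is below pvalue" is an assumption chosen for illustration; the step-wise criterion used inside LMDC.regre may be different.

```r
# Hypothetical forward selection by p-value; not the LMDC.regre implementation.
forward_lm <- function(y, covar, data, pvalue = 0.05) {
  selected <- character(0)
  repeat {
    candidates <- setdiff(covar, selected)
    if (length(candidates) == 0) break
    # p-value of each candidate when added to the current model
    pvals <- sapply(candidates, function(v) {
      fit <- lm(reformulate(c(selected, v), response = y), data = data)
      summary(fit)$coefficients[v, "Pr(>|t|)"]
    })
    if (min(pvals) >= pvalue) break   # no remaining candidate is significant
    selected <- c(selected, names(which.min(pvals)))
  }
  rhs <- if (length(selected) > 0) selected else "1"
  final <- lm(reformulate(rhs, response = y), data = data)
  list(model = final, xvar = selected, nvar = length(selected))
}

set.seed(42)
n <- 200
dat <- data.frame(matrix(rnorm(n * 5), n, 5))   # columns X1, ..., X5
dat$y <- 3 * dat$X2 - 2 * dat$X4 + rnorm(n, sd = 0.5)

out <- forward_lm("y", paste0("X", 1:5), dat, pvalue = 0.05)
out$xvar
```

The returned list mirrors the model, xvar and nvar elements described in the Value section, which is why those names were chosen for the sketch.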

References

Ordonez, C., Oviedo de la Fuente, M., Roca-Pardinas, J., Rodriguez-Perez, J. R. (2018). Determining optimum wavelengths for leaf water content estimation from reflectance: A distance correlation approach. Chemometrics and Intelligent Laboratory Systems, 173, 41-50. doi:10.1016/j.chemolab.2017.12.001

See also

lm, gam, dcor.xy.

Author

Manuel Oviedo de la Fuente manuel.oviedo@udc.es

Examples

if (FALSE) { # \dontrun{
data(tecator)
absorp=fdata.deriv(tecator$absorp.fdata,2)
ind=1:129
x=absorp[ind,]
y=tecator$y$Fat[ind]
newx=absorp[-ind,]
newy=tecator$y$Fat[-ind]

## Functional PC regression
res.pc=fregre.pc(x,y,1:6)
pred.pc=predict(res.pc,newx)

# Functional regression with basis representation
res.basis=fregre.basis.cv(x,y)
pred.basis=predict(res.basis[[1]],newx)

# Functional nonparametric regression
res.np=fregre.np.cv(x,y)
pred.np=predict(res.np,newx)

dat    <- data.frame("y"=y,x$data)
newdat <- data.frame("y"=newy,newx$data)

res.gam=fregre.gsam(y~s(x),data=list("df"=dat,"x"=x))
pred.gam=predict(res.gam,list("x"=newx))

# Preselect impact points from the distance correlation curve
dc.raw <- LMDC.select("y", data = dat, tol = 0.05, pvalue = 0.05,
                      plot = FALSE, smo = TRUE, verbose = FALSE)
covar <- paste("X", dc.raw$maxLocal, sep = "")
# Preselected design/impact points
covar
# Test for a linear relationship between the response and the curves
ftest <- flm.test(dat[, -1], dat[, "y"], B = 500, verbose = FALSE,
    plot.it = FALSE, type.basis = "pc", est.method = "pc", p = 4, G = 50)

if (ftest$p.value > 0.05) {
  # Linear relationship, step-wise lm is recommended
  out <- LMDC.regre("y", covar, dat, newdat, pvalue = .05,
              method = "lm", plot = FALSE, verbose = FALSE)
} else {
  # Non-linear relationship, step-wise gam is recommended
  out <- LMDC.regre("y", covar, dat, newdat, pvalue = .05,
              method = "gam", plot = FALSE, verbose = FALSE)
}
             
# Final  design/impact points
out$xvar

# Predictions
mean((newy-pred.pc)^2)                
mean((newy-pred.basis)^2) 
mean((newy-pred.np)^2)              
mean((newy-pred.gam)^2) 
mean((newy-out$pred)^2)
} # }