Kullback–Leibler distance

Measures the proximity between two groups of densities (of class fdata) by computing the Kullback–Leibler distance.

Usage

metric.kl(fdata1, fdata2 = NULL, symm = TRUE, base = exp(1), eps = 1e-10, ...)

Arguments

fdata1: Functional data 1 (fdata class) with the densities. The fdata1 object has dimension (n1 x m), where n1 is the number of densities and m is the number of discretization points at which each density is observed.
fdata2: Functional data 2 (fdata class) with the densities. The fdata2 object has dimension (n2 x m).
symm: If TRUE, the symmetric K–L distance is computed; see the Details section.
base: The logarithm base used to compute the distance.
eps: Tolerance value (by default 1e-10); see the Details section for how it is used.
...: Further arguments passed to or from other methods.

Details

The Kullback–Leibler distance between $f(t)$ and $g(t)$ is $$metric.kl(f(t),g(t))= \int_{a}^{b} f(t) \log\left(\frac{f(t)}{g(t)}\right)dt$$ where $t$ ranges over the m coordinates of the points where the density is observed (the argvals of the fdata object) and $a$, $b$ are the limits of the interval on which the densities are observed.
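The integral above can be approximated directly on the discretization grid. The following is a minimal base-R sketch that uses the trapezoidal rule; `kl_trapz` is a hypothetical helper written for illustration, not the routine that `metric.kl` uses internally, so its results need not match the package's output exactly.

```r
# Hypothetical helper (not part of the package): trapezoidal approximation
# of the integral of f(t) * log(f(t) / g(t)) over the grid tt.
kl_trapz <- function(f, g, tt, base = exp(1)) {
  integrand <- f * log(f / g, base = base)
  # Trapezoidal rule: interval width times the mean of the endpoint values
  sum(diff(tt) * (integrand[-1] + integrand[-length(integrand)]) / 2)
}

# Two normal densities, both strictly positive on the grid
tt <- seq(-10, 10, length.out = 2001)
f <- dnorm(tt, mean = 0, sd = 1)
g <- dnorm(tt, mean = 1, sd = 1)

kl_fg <- kl_trapz(f, g, tt)
kl_fg  # close to the closed-form value (0 - 1)^2 / 2 = 0.5

# Changing the logarithm base only rescales the result by 1 / log(base)
kl_fg_bits <- kl_trapz(f, g, tt, base = 2)
all.equal(kl_fg_bits, kl_fg / log(2))
```

The normal densities are used here only because their K–L distance has a simple closed form against which the approximation can be checked.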

The Kullback–Leibler distance is asymmetric: $$metric.kl(f(t),g(t))\neq metric.kl(g(t),f(t))$$ A symmetric version of the K–L distance (the default) is obtained as $$0.5\left(metric.kl(f(t),g(t))+metric.kl(g(t),f(t))\right)$$
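The asymmetry, and the effect of averaging the two directed distances, can be seen in a small base-R sketch. The helpers `trapz`, `kl` and `kl_symm` are hypothetical names introduced for illustration only; they are not functions of the package and do not reproduce its internal computation.

```r
# Hypothetical helpers for illustration (not part of the package)
trapz <- function(y, tt) sum(diff(tt) * (y[-1] + y[-length(y)]) / 2)
kl <- function(f, g, tt) trapz(f * log(f / g), tt)
kl_symm <- function(f, g, tt) 0.5 * (kl(f, g, tt) + kl(g, f, tt))

# Two centred normal densities with different standard deviations
tt <- seq(-15, 15, length.out = 3001)
f <- dnorm(tt, mean = 0, sd = 1)
g <- dnorm(tt, mean = 0, sd = 2)

kl(f, g, tt)  # directed distance; closed form log(2) + 1/8 - 1/2
kl(g, f, tt)  # reverse direction; closed form -log(2) + 2 - 1/2

# The averaged version does not depend on the order of its arguments
kl_symm(f, g, tt) == kl_symm(g, f, tt)
```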

If $f_i(t)=0$ and $g_j(t)=0$, then $metric.kl(f_i(t),g_j(t))=0$.

If $\left|f_i(t)-g_i(t) \right|\leq \epsilon$, then $f_i(t)$ is replaced by $f_i(t)+\epsilon$, where $\epsilon$ is the tolerance value (by default eps=1e-10).
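The two rules above guard against undefined terms in the integrand. The base-R sketch below illustrates the underlying problem and one simple way a small positive tolerance avoids it. The flooring step shown is an assumption made for illustration; it is not necessarily the exact adjustment that `metric.kl` applies.

```r
# Where both densities vanish, the integrand f * log(f / g) is 0 * log(0 / 0)
f0 <- 0
g0 <- 0
undefined_term <- f0 * log(f0 / g0)
undefined_term            # NaN, although the limiting value is 0

# Where only g vanishes, the term is infinite
infinite_term <- 1 * log(1 / 0)
infinite_term             # Inf

# Illustrative guard (an assumption, not the package's code): floor both
# densities at a small tolerance before taking the ratio
eps <- 1e-10
f <- c(0, 0.5, 1.0, 0)
g <- c(0, 1.0, 0.5, 0)
f_adj <- pmax(f, eps)
g_adj <- pmax(g, eps)
terms <- f_adj * log(f_adj / g_adj)
terms                     # all finite; entries where f = g = 0 contribute 0
```

Because the adjusted values differ from the observed ones only by the tolerance, the choice of eps can affect the result where the densities are close to zero, which is why the examples compare several values of eps.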

The coordinates of the points where the density is observed (discretization points $t$) can be equally spaced (by default) or not.
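For unequally spaced points, a numerical integral has to weight each interval by its own width. The base-R sketch below illustrates this with a trapezoidal rule on an equally spaced and an unequally spaced grid. The helpers `trapz` and `kl_on_grid` are hypothetical and written for illustration; the package's own integration routine may differ.

```r
# Hypothetical helper (not part of the package): trapezoidal rule that uses
# the actual spacing between consecutive grid points
trapz <- function(y, tt) sum(diff(tt) * (y[-1] + y[-length(y)]) / 2)

# Equally spaced grid on [0, 1]
tt_even <- seq(0, 1, length.out = 201)
# Unequally spaced grid on [0, 1], denser near 0
tt_uneven <- seq(0, 1, length.out = 201)^2

# K-L distance between the Beta(2, 3) and Beta(3, 2) densities on a grid
kl_on_grid <- function(tt) {
  f <- dbeta(tt, 2, 3)
  g <- dbeta(tt, 3, 2)
  integrand <- f * log(f / g)
  # Both densities are 0 at the endpoints, where the integrand tends to 0
  integrand[!is.finite(integrand)] <- 0
  trapz(integrand, tt)
}

kl_even <- kl_on_grid(tt_even)
kl_uneven <- kl_on_grid(tt_uneven)
c(kl_even, kl_uneven)  # both close to the closed-form value 1/2
```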

References

Kullback, S., Leibler, R.A. (1951). On information and sufficiency. Annals of Mathematical Statistics, 22: 79–86.

Author

Manuel Febrero-Bande, Manuel Oviedo de la Fuente <manuel.oviedo@udc.es>

Examples

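if (FALSE) { # \dontrun{
library(fda.usc)

# Evaluate four beta densities on an equally spaced grid over [0, 1]
n <- 201
tt01 <- seq(0, 1, length.out = n)
rtt01 <- c(0, 1)
x1 <- dbeta(tt01, 20, 5)
x2 <- dbeta(tt01, 21, 5)
y1 <- dbeta(tt01, 5, 20)
y2 <- dbeta(tt01, 5, 21)
xy <- fdata(rbind(x1, x2, y1, y2), tt01, rtt01)
plot(xy)

# Symmetric K-L distances (symm = TRUE by default) for two tolerance values
round(metric.kl(xy, xy, eps = 1e-5), 6)
round(metric.kl(xy, eps = 1e-5), 6)
round(metric.kl(xy, eps = 1e-6), 6)

# Asymmetric K-L distances
round(metric.kl(xy, xy, symm = FALSE, eps = 1e-5), 6)
round(metric.kl(xy, symm = FALSE, eps = 1e-5), 6)

# Single densities, including curves restricted to the first 101 grid points
plot(c(fdata(y1[1:101]), fdata(y2[1:101])))
metric.kl(fdata(x1))
metric.kl(fdata(x1), fdata(x2), eps = 1e-5, symm = FALSE)
metric.kl(fdata(x1), fdata(x2), eps = 1e-6, symm = FALSE)
metric.kl(fdata(y1[1:101]), fdata(y2[1:101]), eps = 1e-13, symm = FALSE)
metric.kl(fdata(y1[1:101]), fdata(y2[1:101]), eps = 1e-14, symm = FALSE)
} # }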