Density Estimation - Example of Density Estimation - Script For Example

Script For Example

The following commands of the R programming language will create the figures shown above. These commands can be entered at the command prompt by using cut and paste.

library (MASS) data(Pima.tr) data(Pima.te) Pima <- rbind (Pima.tr, Pima.te) glu <- Pima d0 <- Pima == 'No' d1 <- Pima == 'Yes' base.rate.d1 <- sum(d1)/(sum(d1) + sum(d0)) glu.density <- density (glu) glu.d0.density <- density (glu) glu.d1.density <- density (glu) approxfun (glu.d0.density$x, glu.d0.density$y) -> glu.d0.f approxfun (glu.d1.density$x, glu.d1.density$y) -> glu.d1.f p.d.given.glu <- function (glu, base.rate.d1) { p1 <- glu.d1.f(glu) * base.rate.d1 p0 <- glu.d0.f(glu) * (1 - base.rate.d1) p1/(p0+p1) } x <- 1:250 y <- p.d.given.glu (x, base.rate.d1) plot (x, y, type='l', col='red', xlab='glu', ylab='estimated p(diabetes|glu)') plot (density(glu), col='blue', xlab='glu', ylab='estimate p(glu), p(glu|diabetes), p(glu|not diabetes)', main=NA) lines (density(glu), col='red')

Note that the above conditional density estimator uses bandwidths that are optimal for unconditional densities. Alternatively, one could use the method of Hall, Racine and Li (2004) and the R np package for automatic (data-driven) bandwidth selection that is optimal for conditional density estimates; see the np vignette for an introduction to the np package. The following commands of the R programming language use the npcdens function to deliver optimal smoothing. Note that the response "Yes"/"No" is a factor.

library(np) fy.x <- npcdens(type~glu,nmulti=1,data=Pima) Pima.eval <- data.frame(type=factor("Yes"), glu=seq(min(Pima$glu),max(Pima$glu),length=250)) plot (x, y, type='l', lty=2, col='red', xlab='glu', ylab='estimated p(diabetes|glu)') lines(Pima.eval$glu,predict(fy.x,newdata=Pima.eval),col="blue") legend(0,1,c("Unconditional bandwidth", "Conditional bandwidth"), col=c("red","blue"),lty=c(2,1))

The third figure uses optimal smoothing via the method of Hall, Racine, and Li (2004) indicating that the unconditional density bandwidth used in the second figure above yields a conditional density estimate that may be somewhat undersmoothed.

Estimated probability of p(diabetes=1 | glu).

Read more about this topic:  Density Estimation, Example of Density Estimation

Famous quotes containing the word script:

    ...he sent letters to all the royal provinces, to every province in its own script and to every people in its own language, declaring that every man should be master in his own house.
    Bible: Hebrew, Esther 1:22.

    King Ahasuerus, after his Queen Vashti refused to come at the king’s command.