Kernel Density Estimation - Statistical Implementation

Statistical Implementation

A non-exhaustive list of software implementations of kernel density estimators includes:

  • In Analytica release 4.4, the Smoothing option for PDF results uses KDE, and from expressions it is available via the built-in Pdf function.
  • In C/C++, FIGTree is a library that can be used to compute kernel density estimates using normal kernels. A MATLAB interface is available.
  • In C++, libagf is a library for variable kernel density estimation.
  • In CrimeStat, kernel density estimation is implemented using five different kernel functions – normal, uniform, quartic, negative exponential, and triangular. Both single- and dual-kernel density estimate routines are available. Kernel density estimation is also used in interpolating a Head Bang routine, in estimating a two-dimensional Journey-to-crime density function, and in estimating a three-dimensional Bayesian Journey-to-crime estimate.
  • In ESRI products, kernel density mapping is managed out of the Spatial Analyst toolbox and uses the Epanechnikov kernel.
  • In gnuplot, kernel density estimation is implemented by the smooth kdensity option. The datafile can contain a weight and bandwidth for each point, or the bandwidth can be set automatically.
  • In Haskell, kernel density is implemented in the statistics package.
  • In Java, the Weka (machine learning) package provides weka.estimators.KernelEstimator, among others.
  • In JavaScript, the visualization package D3 offers a KDE implementation in its science.stats package.
  • In JMP, the Fit Y by X platform can be used to estimate univariate and bivariate kernel densities.
  • In MATLAB, kernel density estimation is implemented through the ksdensity function (Statistics Toolbox). This function does not provide an automatic data-driven bandwidth but uses a rule of thumb, which is optimal only when the target density is normal. A free MATLAB software package that implements an automatic bandwidth selection method is available from the MATLAB Central File Exchange for one-dimensional data and for two-dimensional data.
  • In Mathematica, numeric kernel density estimation is implemented by the function SmoothKernelDistribution and symbolic estimation is implemented by the function KernelMixtureDistribution, both of which provide data-driven bandwidths.
  • In the NAG Library, kernel density estimation is implemented via the g10ba routine (available in both the Fortran and the C versions of the Library).
  • In Octave, kernel density estimation is implemented by the kernel_density option (econometrics package).
  • In Perl, an implementation can be found in the Statistics-KernelEstimation module.
  • In Python, an implementation is available in the scipy.stats module of the SciPy package.
  • In R, it is implemented through the density function and the bkde function in the KernSmooth library (both included in the base distribution), the kde function in the ks library, the npudens function in the np library (numeric and categorical data), and the sm.density function in the sm library. For a standalone implementation that does not require installing any packages or libraries, see kde.R.
  • In SAS, proc kde can be used to estimate univariate and bivariate kernel densities.
  • In SciPy, scipy.stats.gaussian_kde can be used to perform Gaussian kernel density estimation in arbitrary dimensions, including bandwidth estimation.
  • In Stata, it is implemented through kdensity; for example, histogram x, kdensity. Alternatively, a free Stata module, KDENS, is available that allows a user to estimate 1D or 2D density functions.
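
As an illustration of one entry in the list above, the following is a minimal sketch of the SciPy route using scipy.stats.gaussian_kde. The sample data is synthetic and the variable names are chosen only for this example; none of them come from the list itself. The sketch fits a one-dimensional estimate with the default bandwidth rule, repeats the fit with Silverman's rule, and then fits a two-dimensional estimate to show the (dimensions, points) input layout.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Synthetic bimodal sample (an assumption made purely for illustration).
rng = np.random.default_rng(0)
sample = np.concatenate([
    rng.normal(-2.0, 0.5, 300),
    rng.normal(1.0, 1.0, 700),
])

# One-dimensional estimate; the default bandwidth rule is Scott's rule.
kde = gaussian_kde(sample)
grid = np.linspace(-6.0, 6.0, 601)
density = kde(grid)

# A simple Riemann sum over the grid should be close to 1 for a density.
area = float(np.sum(density) * (grid[1] - grid[0]))

# The same data with Silverman's rule selected by name.
kde_silverman = gaussian_kde(sample, bw_method="silverman")

# Two-dimensional estimate: the input has shape (dimensions, points).
data_2d = rng.normal(size=(2, 500))
kde_2d = gaussian_kde(data_2d)
value_at_origin = kde_2d(np.array([[0.0], [0.0]]))

print("1D area under estimate:", area)
print("Scott factor:", kde.factor, "Silverman factor:", kde_silverman.factor)
print("2D density at origin:", value_at_origin[0])
```

Both bandwidth rules used here are rules of thumb built around a normal reference density, so for strongly multimodal data it may be worth comparing them against a data-driven selector from one of the other packages listed.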
