Generalizations
Mercer's theorem itself is a generalization of the result that any positive semidefinite matrix is the Gramian matrix of a set of vectors.
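The finite-dimensional statement can be checked directly. The following is a minimal sketch in plain Python, assuming a real symmetric input matrix; the function names `gram_factor` and `gram_matrix` are illustrative and not taken from any library. It uses a Cholesky-style factorization A = L Lᵀ that tolerates zero pivots, so the rows of L are vectors whose Gramian matrix is A.

```python
import math


def gram_factor(a, tol=1e-9):
    """Return vectors v_0..v_{n-1} with dot(v_i, v_j) == a[i][j].

    `a` is assumed to be a real symmetric positive semidefinite matrix
    given as a list of lists. Raises ValueError if it is not PSD.
    """
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Remaining diagonal mass after subtracting earlier columns.
        d = a[j][j] - sum(low[j][k] ** 2 for k in range(j))
        if d < -tol:
            raise ValueError("matrix is not positive semidefinite")
        for i in range(j + 1, n):
            r = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if d <= tol:
                # Zero pivot: for a PSD matrix the residual column must vanish.
                if abs(r) > tol:
                    raise ValueError("matrix is not positive semidefinite")
            else:
                low[i][j] = r / math.sqrt(d)
        if d > tol:
            low[j][j] = math.sqrt(d)
    return low  # row i of the factor is the vector v_i


def gram_matrix(vectors):
    """Matrix of all pairwise dot products of the given vectors."""
    return [[sum(p * q for p, q in zip(u, v)) for v in vectors] for u in vectors]


# A singular PSD example: the first two vectors coincide.
a = [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 4.0]]
vectors = gram_factor(a)
print(gram_matrix(vectors))
```

Mercer's theorem plays the same role for kernels: the eigenfunctions, scaled by the square roots of the eigenvalues, take the place of the vectors.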
The first generalization replaces the interval [a, b] with any compact Hausdorff space X, and Lebesgue measure on [a, b] with a finite countably additive measure μ on the Borel algebra of X whose support is X. This means that μ(U) > 0 for every nonempty open subset U of X.
A more recent generalization replaces these conditions with the following: the set X is a first-countable topological space endowed with a Borel (complete) measure μ. X is the support of μ and, for all x in X, there is an open set U containing x and having finite measure. Then essentially the same result holds:
Theorem. Suppose K is a continuous symmetric non-negative definite kernel on X. If the function κ belongs to L^1_μ(X), where κ(x) = K(x, x) for all x in X, then there is an orthonormal set {e_i}_i of L^2_μ(X) consisting of eigenfunctions of T_K such that the corresponding sequence of eigenvalues {λ_i}_i is nonnegative. The eigenfunctions corresponding to non-zero eigenvalues are continuous on X and K has the representation
$$K(x, y) = \sum_{i=1}^{\infty} \lambda_i \, e_i(x) \, e_i(y),$$
where the convergence is absolute and uniform on compact subsets of X.
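As a concrete illustration of such a representation, consider the classical special case X = [0, 1] with Lebesgue measure and the kernel K(x, y) = min(x, y). Its eigenpairs are known in closed form: λ_n = 1/((n − 1/2)²π²) and e_n(x) = √2 sin((n − 1/2)πx). The sketch below, whose function names are illustrative only, measures how far the partial sums are from the kernel on a grid and checks that the eigenvalues sum to the integral of κ(x) = K(x, x) = x, which is 1/2.

```python
import math


def eigenpair(n, x):
    """n-th eigenvalue of K(x, y) = min(x, y) on [0, 1] and e_n evaluated at x."""
    freq = (n - 0.5) * math.pi
    return 1.0 / freq ** 2, math.sqrt(2.0) * math.sin(freq * x)


def mercer_partial_sum(x, y, terms):
    """Sum of lambda_n * e_n(x) * e_n(y) for n = 1..terms."""
    total = 0.0
    for n in range(1, terms + 1):
        lam, ex = eigenpair(n, x)
        _, ey = eigenpair(n, y)
        total += lam * ex * ey
    return total


def sup_error(terms, grid=11):
    """Largest deviation of the partial sum from min(x, y) on a uniform grid."""
    pts = [i / (grid - 1) for i in range(grid)]
    return max(
        abs(mercer_partial_sum(x, y, terms) - min(x, y)) for x in pts for y in pts
    )


def eigenvalue_sum(terms):
    """Partial sum of the eigenvalues; it approaches the integral of K(x, x)."""
    return sum(eigenpair(n, 0.0)[0] for n in range(1, terms + 1))


for terms in (10, 100, 1000):
    print(terms, sup_error(terms), eigenvalue_sum(terms))
```

The grid error shrinks as more terms are added, which is consistent with uniform convergence, although a finite grid can only suggest it rather than prove it.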
The next generalization deals with representations of measurable kernels.
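Let (X, M, μ) be a σ-finite measure space. An L^2 (or square-integrable) kernel on X is a function
$$K \in L^2_{\mu \otimes \mu}(X \times X).$$
L^2 kernels define a bounded operator T_K by the formula
$$(T_K \varphi)(x) = \int_X K(x, y) \, \varphi(y) \, d\mu(y).$$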
T_K is a compact operator (in fact, a Hilbert-Schmidt operator). If the kernel K is symmetric, then by the spectral theorem T_K has an orthonormal basis of eigenvectors. Those eigenvectors that correspond to non-zero eigenvalues can be arranged in a sequence {e_i}_i (regardless of separability).
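The boundedness of T_K follows from a short estimate. The sketch below assumes that T_K acts as the integral operator (T_K φ)(x) = ∫ K(x, y) φ(y) dμ(y) on L^2_μ(X); the first line is the Cauchy-Schwarz inequality in the variable y, and the second integrates it over x.

```latex
\begin{align*}
|(T_K \varphi)(x)|^2
  &\le \left( \int_X |K(x, y)|^2 \, d\mu(y) \right) \|\varphi\|_{L^2_\mu}^2, \\
\|T_K \varphi\|_{L^2_\mu}^2
  &\le \left( \int_X \int_X |K(x, y)|^2 \, d\mu(y) \, d\mu(x) \right) \|\varphi\|_{L^2_\mu}^2
   = \|K\|_{L^2_{\mu \otimes \mu}}^2 \, \|\varphi\|_{L^2_\mu}^2 .
\end{align*}
```

Hence the operator norm of T_K is at most the L^2 norm of K, which is also its Hilbert-Schmidt norm.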
Theorem. If K is a symmetric non-negative definite kernel on (X, M, μ), then
$$K(x, y) = \sum_{i=1}^{\infty} \lambda_i \, e_i(x) \, e_i(y),$$
where the convergence is in the L^2 norm. Note that when continuity of the kernel is not assumed, the expansion need not converge uniformly.
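The L^2 convergence can be made quantitative. The sketch below assumes a real symmetric L^2 kernel K with eigenvalues λ_i and orthonormal eigenfunctions e_i of T_K, so that the products e_i(x) e_j(y) are orthonormal in L^2 of the product measure; the truncation error is then given by Parseval's identity.

```latex
\begin{align*}
\Bigl\| K - \sum_{i=1}^{N} \lambda_i \, e_i \otimes e_i \Bigr\|_{L^2_{\mu \otimes \mu}}^2
  &= \sum_{i > N} \lambda_i^2 ,
  &
\sum_{i=1}^{\infty} \lambda_i^2
  &= \|K\|_{L^2_{\mu \otimes \mu}}^2 < \infty ,
\end{align*}
% where (e_i \otimes e_i)(x, y) = e_i(x) e_i(y).
```

Since the series of squared eigenvalues converges, its tail tends to zero as N grows. This gives convergence in the L^2 norm but carries no information about pointwise or uniform convergence.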