In probability and statistics, a random variable or stochastic variable is a variable whose value is subject to variations due to chance (i.e. randomness, in a mathematical sense). As opposed to other mathematical variables, a random variable conceptually does not have a single, fixed value (even if unknown); rather, it can take on a set of possible different values, each with an associated probability.
A random variable's possible values might represent the possible outcomes of a yet-to-be-performed experiment or an event that has not happened yet, or the potential values of a past experiment or event whose already-existing value is uncertain (e.g. as a result of incomplete information or imprecise measurements). They may also conceptually represent either the results of an "objectively" random process (e.g. rolling a die), or the "subjective" randomness that results from incomplete knowledge of a quantity. The meaning of the probabilities assigned to the potential values of a random variable is not part of probability theory itself, but instead related to philosophical arguments over the interpretation of probability. The mathematics works the same regardless of the particular interpretation in use.
Random variables can be classified as either discrete (taking any of a specified list of exact values) or continuous (taking any numerical value in an interval or collection of intervals). The mathematical function describing the possible values of a random variable and their associated probabilities is known as a probability distribution. The realizations of a random variable, i.e. the results of randomly choosing values according to the variable's probability distribution, are called random variates.
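As an illustration of a discrete probability distribution and its random variates, the following minimal Python sketch assumes a fair six-sided die. The names `values`, `probabilities`, and `draw_variates` are hypothetical and chosen only for this example.

```python
import random
from collections import Counter

# Assumed example: a fair six-sided die as a discrete random variable.
values = [1, 2, 3, 4, 5, 6]
probabilities = [1 / 6] * 6  # the probability distribution; must sum to 1


def draw_variates(n, seed=0):
    """Return n random variates drawn according to the distribution above."""
    rng = random.Random(seed)  # seeded so that repeated runs agree
    return rng.choices(values, weights=probabilities, k=n)


variates = draw_variates(10_000)
counts = Counter(variates)

# The relative frequencies should be close to the assigned probabilities.
for v in values:
    print(v, counts[v] / len(variates))
```

Each element of `variates` is one realization of the random variable. With a large sample, the printed relative frequencies approximate the probabilities of the distribution.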
The basic concept of "random variable" in statistics is real-valued. However, one can consider arbitrary types such as boolean values, categorical variables, complex numbers, vectors, matrices, sequences, trees, sets, shapes, manifolds, functions, and processes. The term random element is used to encompass all such related concepts. An example is the stochastic process, a set of indexed random variables (typically indexed by time or space). These more general concepts are particularly useful in fields such as computer science and natural language processing where many of the basic elements of analysis are non-numerical. Such general random elements can sometimes be treated as sets of real-valued random variables — often more specifically as random vectors. For example:
- A "random word" may be parameterized by an integer-valued index into the vocabulary of possible words, or alternatively as an indicator vector in which exactly one element is 1 and the others are 0, with the position of the 1 identifying a particular word in the vocabulary.
- A "random sentence" may be parameterized as a vector of random words.
- A random graph on N vertices may be parameterized as an NxN matrix giving the weight of each edge, or 0 for no edge. (If the graph is unweighted, 1 indicates an edge and 0 indicates no edge.)
However, reduction to numerical values is not essential for dealing with random elements: a randomly selected individual remains an individual, not a number.
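The numerical parameterizations listed above can be sketched in Python as follows. This is only an illustration: the toy vocabulary and all function names are hypothetical, and the random graph assumes each possible edge is included independently with probability `p`, which is one of several possible random graph models.

```python
import random

vocabulary = ["the", "cat", "sat"]  # hypothetical toy vocabulary


def random_word_index(rng):
    """A random word as an integer index into the vocabulary."""
    return rng.randrange(len(vocabulary))


def indicator_vector(index, size):
    """The same word as a vector with a single 1 at position `index`."""
    return [1 if i == index else 0 for i in range(size)]


def random_sentence(rng, length):
    """A random sentence as a vector of random word indices."""
    return [random_word_index(rng) for _ in range(length)]


def random_graph(rng, n, p):
    """An unweighted, undirected random graph on n vertices as an n x n 0/1 matrix.

    Assumption: each possible edge is present independently with probability p.
    """
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                matrix[i][j] = matrix[j][i] = 1
    return matrix


rng = random.Random(0)
word = random_word_index(rng)
print(vocabulary[word], indicator_vector(word, len(vocabulary)))
print([vocabulary[i] for i in random_sentence(rng, 4)])
print(random_graph(rng, 4, 0.5))
```

The index and the indicator vector carry the same information. The vector form is convenient when the word needs to be treated as a random vector.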
The formal mathematical treatment of random variables belongs to probability theory. In that context, a random variable is defined as a measurable function on a probability space.
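To make the "function on a probability space" view concrete, the sketch below assumes a finite sample space: the 36 equally likely outcomes of rolling two fair dice. The names `omega`, `P`, `X`, and `distribution` are illustrative only. The random variable `X` maps each outcome to the sum of the two faces, and its probability distribution is obtained by adding up the probabilities of all outcomes that map to the same value.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Sample space: ordered pairs (first die, second die), assumed equally likely.
omega = list(product(range(1, 7), repeat=2))
P = {outcome: Fraction(1, 36) for outcome in omega}


def X(outcome):
    """Random variable: a function from outcomes to real numbers (the sum)."""
    return outcome[0] + outcome[1]


def distribution(rv, measure):
    """Distribution of rv: total probability of the outcomes mapped to each value."""
    dist = defaultdict(Fraction)
    for outcome, prob in measure.items():
        dist[rv(outcome)] += prob
    return dict(dist)


dist_X = distribution(X, P)
for value in sorted(dist_X):
    print(value, dist_X[value])
```

On a finite sample space every function is measurable, so the measurability condition only becomes a real restriction for more general spaces.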