Calgary Corpus

The Calgary Corpus is a collection of text and binary data files commonly used for comparing data compression algorithms. It was created by Ian Witten, Tim Bell, and John Cleary at the University of Calgary in 1987 and was widely used throughout the 1990s. In 1997 it was replaced by the Canterbury Corpus, but the Calgary Corpus remains available and is still useful for comparing compression algorithms.
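As a rough illustration of what comparing compression algorithms on a shared set of files involves, the sketch below compresses one input with three codecs from the Python standard library and reports the result in bits per byte of input. The codecs, the `bits_per_byte` and `compare` helpers, and the stand-in sample data are assumptions made for this example; they are not part of the corpus or of any official benchmark procedure.

```python
import bz2
import lzma
import zlib


def bits_per_byte(original: bytes, compressed: bytes) -> float:
    """Compressed size in bits divided by original size in bytes."""
    return 8 * len(compressed) / len(original)


def compare(data: bytes) -> dict[str, float]:
    """Compress the same input with three standard-library codecs."""
    codecs = {
        "zlib": zlib.compress,
        "bz2": bz2.compress,
        "lzma": lzma.compress,
    }
    return {name: bits_per_byte(data, fn(data)) for name, fn in codecs.items()}


# Stand-in input (hypothetical); a real comparison would read each
# corpus file from disk and run the same measurement on it.
sample = b"the quick brown fox jumps over the lazy dog\n" * 200
results = compare(sample)

# List the codecs from best (fewest bits per byte) to worst.
for name, bpb in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{name}: {bpb:.3f} bits per byte")
```

Running every algorithm on the same fixed files is what makes the figures comparable; lower bits per byte means better compression on that input.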

