MUMPS - Overview

Overview

MUMPS is a language intended for and designed to build database applications. Secondary language features were included to help programmers make applications using minimal computing resources. The original implementations were interpreted, though modern implementations may be fully or partially compiled. Individual "programs" run in memory "partitions". Early MUMPS memory partitions were limited to 2048 bytes so aggressive abbreviation greatly aided multi-programming on severely resource limited hardware, because more than one MUMPS job could fit into the very small memories extant in hardware at the time. The ability to provide multi-user systems was another language design. The Multi-Programming in the acronym of language name point to this. Even the earliest machines running MUMPS supported multiple jobs running at the same time. With the change from mini-computers to micro-computers a few years later, even a "single user PC" with a single 8-bit CPU with 16K or 64K of memory could support multiple users, running dumb terminals in command line mode. (without any trace of a graphical user interface).

Since memory was tight originally, the language design for MUMPS valued very terse code. Thus, every MUMPS command or function name could be abbreviated from one to three letters in length, e.g. Quit (exit program) as Q, $P = $Piece function, R = Read command, $TR = $Translate function. Spaces and end-of-line markers are significant in MUMPS because line scope promoted the same terse language design. Hence, an entire line of program code could express the same idea a small number of characters that other programming languages might easily take 5 to 10 times as many characters to express. Abbreviation was a common feature of languages designed in this period (e.g., FOCAL-69, early BASICs such as Tiny BASIC, etc.). An unfortunate side effect of this coupled with the early need to write minimalist code was that MUMPS programmers routinely did not comment code and used extensive abbreviations, meaning that even an expert MUMPS programmer could not just skim through a page of code to see its function but would have to analyze it line by line.

Database interaction is transparently built into the language. The MUMPS language provides a hierarchical database made up of persistent sparse arrays, which is implicitly "opened" for every MUMPS application. All variable names prefixed with the caret character ("^") use permanent (instead of RAM) storage, will maintain their values after the application exits, and will be visible to (and modifiable by) other running applications. Variables using this shared and permanent storage are called Globals in MUMPS, because the scoping of these variables is "globally available" to all jobs on the system. The more recent and more common use of the name "global variables" in other languages is a more limited scoping of names, coming from the fact that unscoped variables are "globally" available to any programs running in the same process, but not shared among multiple processes. The MUMPS Storage mode (i.e. Globals stored as persistent sparse arrays), gives the MUMPS database the characteristics of a document-oriented database.

All variable names which are not prefixed with caret character ("^") are temporary and private. Like global variables, they also have a hierarchical storage model, but are only "locally available" to a single job, thus they are called "locals". Both "globals" and "locals" can have child nodes (called subscripts in MUMPS terminology). Subscripts are not limited to numerals—any ASCII character or group of characters can be a subscript identifier. While this is not uncommon for modern languages such as Perl or JavaScript, it was a highly unusual feature in the late 1970s. This capability was not universally implemented in MUMPS systems before the 1984 ANSI standard, as only canonically numeric subscripts were required by the standard to be allowed. Thus, the variable named 'Car' can have subscripts "Door", "Steering Wheel" and "Engine", each of which can contain a value and have subscripts of their own. The variable ^Car("Door") could have a nested variable subscript of "Color" for example. Thus, you could say

SET ^Car("Door","Color")="BLUE"

to modify a nested child node of ^Car. In MUMPS terms, "Color" is the 2nd subscript of the variable ^Car (both the names of the child-nodes and the child-nodes themselves are likewise called subscripts). Hierarchical variables are similar to objects with properties in many object oriented languages. Additionally, the MUMPS language design requires that all subscripts of variables are automatically kept in sorted order. Numeric subscripts (including floating-point numbers) are stored from lowest to highest. All non-numeric subscripts are stored in alphabetical order following the numbers. In MUMPS terminology, this is canonical order. By using only non-negative integer subscripts, the MUMPS programmer can emulate the arrays data type from other languages. Although MUMPS does not natively offer a full set of DBMS features such as mandatory schemas, several DBMS systems have been built on top of it that provide application developers with flat-file, relational and network database features.

Additionally, there are built-in operators which treat a delimited string (e.g., comma-separated values) as an array. Early MUMPS programmers would often store a structure of related information as a delimited string, parsing it after it was read in; this saved disk access time and offered considerable speed advantages on some hardware.

MUMPS has no data types. Numbers can be treated as strings of digits, or strings can be treated as numbers by numeric operators (coerced, in MUMPS terminology). Coercion can have some odd side effects, however. For example, when a string is coerced, the parser turns as much of the string (starting from the left) into a number as it can, then discards the rest. Thus the statement IF 20<"30 DUCKS" is evaluated as TRUE in MUMPS.

Other features of the language are intended to help MUMPS applications interact with each other in a multi-user environment. Database locks, process identifiers, and atomicity of database update transactions are all required of standard MUMPS implementations.

In contrast to languages in the C or Wirth traditions, some space characters between MUMPS statements are significant. A single space separates a command from its argument, and a space, or newline, separates each argument from the next MUMPS token. Commands which take no arguments (e.g., ELSE) require two following spaces. The concept is that one space separates the command from the (nonexistent) argument, the next separates the "argument" from the next command. Newlines are also significant; an IF, ELSE or FOR command processes (or skips) everything else til the end-of-line. To make those statements control multiple lines, you must use the DO command to create a code block.

Read more about this topic:  MUMPS