Linebreak - in Programming Languages

In Programming Languages

To facilitate the creation of portable programs, programming languages provide some abstractions to deal with the different types of newline sequences used in different environments.

The C programming language provides the escape sequences '\n' (newline) and '\r' (carriage return). However, these are not required to be equivalent to the ASCII LF and CR control characters. The C standard only guarantees two things:

  1. Each of these escape sequences maps to a unique implementation-defined number that can be stored in a single char value.
  2. When writing a file in text mode, '\n' is transparently translated to the native newline sequence used by the system, which may be longer than one character. When reading in text mode, the native newline sequence is translated back to '\n'. In binary mode, no translation is performed, and the internal representation produced by '\n' is output directly.

On Unix platforms, where C originated, the native newline sequence is ASCII LF (0x0A), so '\n' was simply defined to be that value. With the internal and external representation being identical, the translation performed in text mode is a no-op, and text mode and binary mode behave the same. This has caused many programmers who developed their software on Unix systems simply to ignore the distinction completely, resulting in code that is not portable to different platforms.

The C library function fgets is best avoided in binary mode because any file not written with the UNIX newline convention will be misread. Also, in text mode, any file not written with the system's native newline sequence (such as a file created on a UNIX system, then copied to a Windows system) will be misread as well.

Another common problem is the use of '\n' when communicating using an Internet protocol that mandates the use of ASCII CR+LF for ending lines. Writing '\n' to a text mode stream works correctly on Windows systems, but produces only LF on Unix, and something completely different on more exotic systems. Using "\r\n" in binary mode is slightly better.

Many languages, such as C++, Perl, and Haskell provide the same interpretation of '\n' as C.

Java, PHP and Python provide the '\r\n' sequence (for ASCII CR+LF). In contrast to C, these are guaranteed to represent the values U+000D and U+000A, respectively.

The Java I/O libraries do not transparently translate these into platform-dependent newline sequences on input or output. Instead, they provide functions for writing a full line that automatically add the native newline sequence, and functions for reading lines that accept any of CR, LF, or CR+LF as a line terminator (see BufferedReader.readLine). The System.getProperty method can be used to retrieve the underlying line separator.

Example:

String eol = System.getProperty( "line.separator" ); String lineColor = "Color: Red" + eol;

Python permits "Universal Newline Support" when opening a file for reading, when importing modules, and when executing a file.

Some languages have created special variables, constants, and subroutines to facilitate newlines during program execution.

In PHP, to avoid portability problems, newline sequences should be issued using the PHP_EOL constant.

Read more about this topic:  Linebreak

Famous quotes containing the words programming and/or languages:

    If there is a price to pay for the privilege of spending the early years of child rearing in the driver’s seat, it is our reluctance, our inability, to tolerate being demoted to the backseat. Spurred by our success in programming our children during the preschool years, we may find it difficult to forgo in later states the level of control that once afforded us so much satisfaction.
    Melinda M. Marshall (20th century)

    Wealth is so much the greatest good that Fortune has to bestow that in the Latin and English languages it has usurped her name.
    William Lamb Melbourne, 2nd Viscount (1779–1848)