Crlf - Conversion Utilities

Text editors are often used to convert a text file between newline formats; most modern editors can read and write files using at least the different ASCII CR/LF conventions. The standard Windows editor Notepad cannot (although WordPad and the MS-DOS Editor can).

Editors are often unsuitable for converting larger files. On Windows NT/2000/XP, the following command is often used instead:

TYPE unix_file | FIND "" /V > dos_file

On many Unix systems, the dos2unix (sometimes named fromdos or d2u) and unix2dos (sometimes named todos or u2d) utilities are used to translate between ASCII CR+LF (DOS/Windows) and LF (Unix) newlines. Different versions of these commands vary slightly in their syntax. However, the tr command is available on virtually every Unix-like system and is used to perform arbitrary replacement operations on single characters. A DOS/Windows text file can be converted to Unix format by simply removing all ASCII CR characters with

tr -d '\r' < inputfile > outputfile

or, if the text has only CR newlines, by converting all CR newlines to LF with

tr '\r' '\n' < inputfile > outputfile
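As a minimal sketch, the two tr invocations above can be wrapped in small filter functions and checked on in-memory samples. The function names dos_to_unix and mac_to_unix are hypothetical, chosen only for this illustration; od -c is used to print the resulting bytes so that any remaining CR would be visible.

```shell
# Hypothetical helper names; each reads standard input and writes standard output.
dos_to_unix() { tr -d '\r'; }       # delete every CR, so CR+LF becomes LF
mac_to_unix() { tr '\r' '\n'; }     # replace each lone CR with LF

# Demo on in-memory samples; od -c dumps the bytes for inspection.
printf 'one\r\ntwo\r\n' | dos_to_unix | od -c
printf 'one\rtwo\r'     | mac_to_unix | od -c
```

Note that dos_to_unix removes every CR, including any that are not part of a line ending, and that mac_to_unix is only appropriate for input with CR-only newlines.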

The same tasks are sometimes performed with awk, sed, tr or in Perl if the platform has a Perl interpreter:

awk '{sub("$","\r\n"); printf("%s",$0);}' inputfile > outputfile  # UNIX to DOS (adding CRs on Linux- and BSD-based systems that lack GNU extensions)
awk '{gsub("\r",""); print;}' inputfile > outputfile              # DOS to UNIX (removing CRs on Linux- and BSD-based systems that lack GNU extensions)
sed -e 's/$/\r/' inputfile > outputfile                           # UNIX to DOS (adding CRs on Linux-based systems that use GNU extensions)
sed -e 's/\r$//' inputfile > outputfile                           # DOS to UNIX (removing CRs on Linux-based systems that use GNU extensions)
cat inputfile | tr -d "\r" > outputfile                           # DOS to UNIX (removing CRs using tr(1). Not Unicode compliant.)
perl -pe 's/\r?\n|\r/\r\n/g' inputfile > outputfile               # Convert to DOS
perl -pe 's/\r?\n|\r/\n/g' inputfile > outputfile                 # Convert to UNIX
perl -pe 's/\r?\n|\r/\r/g' inputfile > outputfile                 # Convert to old Mac
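One limitation of the awk and sed one-liners is that the UNIX-to-DOS variants add a CR to every line, so applying them to a file that already has CR+LF endings doubles the CR. The following sketch, which assumes a POSIX awk and uses the hypothetical function name to_dos, strips any existing trailing CR before appending CR+LF, so it can be applied repeatedly without changing the result.

```shell
# to_dos: hypothetical filter that normalises LF or CR+LF input to CR+LF.
# Assumes a POSIX awk, which understands the \r escape in regular expressions.
to_dos() {
    awk '{
        sub(/\r$/, "")          # drop a trailing CR if the line already has one
        printf "%s\r\n", $0     # write the line with exactly one CR+LF
    }'
}

# Demo: the second pass leaves the output of the first pass unchanged.
printf 'alpha\nbeta\r\n' | to_dos | to_dos | od -c
```

A final line that lacks a newline is also terminated with CR+LF by this sketch, which may or may not be desired.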

To identify what type of line breaks a text file contains, the file command can be used. The editor Vim can then be used to make a file compatible with the Windows Notepad text editor. For example:

> file myfile.txt
myfile.txt: ASCII English text
> vim myfile.txt

within vim

:set fileformat=dos
:wq

> file myfile.txt
myfile.txt: ASCII English text, with CRLF line terminators

The following grep commands echo the filename (in this case myfile.txt) to the command line if the file is of the specified style:

grep -PL $'\r\n' myfile.txt # show UNIX style file (LF terminated)
grep -Pl $'\r\n' myfile.txt # show DOS style file (CRLF terminated)

For Debian-based systems, these commands are used:

egrep -L $'\r\n' myfile.txt # show UNIX style file (LF terminated)
egrep -l $'\r\n' myfile.txt # show DOS style file (CRLF terminated)

The above grep commands work under Unix systems or in Cygwin under Windows. Note that these commands make some assumptions about the kinds of files that exist on the system: specifically, they assume only UNIX-style and DOS-style files, with no Mac OS 9-style files.
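Where a file might mix both conventions, a per-line count gives a finer answer than a yes/no match. The sketch below is one possible approach, not a replacement for the commands above; the function name eol_style and its output labels are hypothetical. It makes the same assumption as the grep commands: a file with CR-only (Mac OS 9-style) newlines contains no LF, is read as a single line, and is therefore not classified correctly.

```shell
# eol_style FILE: hypothetical helper that prints crlf, lf, mixed or none.
# Assumes a POSIX awk; CR-only (old Mac) files are not handled.
eol_style() {
    awk '
        /\r$/ { crlf++ }                           # line ends in CR before the LF
        END {
            if (NR == 0)          print "none"     # empty file
            else if (crlf == NR)  print "crlf"     # every line is CR+LF
            else if (crlf == 0)   print "lf"       # no line has a trailing CR
            else                  print "mixed"    # both conventions present
        }' "$1"
}

# Demo on a temporary sample file.
sample=$(mktemp)
printf 'a\r\nb\r\n' > "$sample"
eol_style "$sample"     # prints: crlf
rm -f "$sample"
```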

This technique is often combined with find to list files recursively. For instance, the following command checks all "regular files" (that is, it excludes directories, symbolic links, etc.) to find all UNIX-style files in a directory tree, starting from the current directory (.), and saves the results in the file unix_files.txt, overwriting it if it already exists:

find . -type f -exec grep -PL '\r\n' {} \; > unix_files.txt

This example finds C source and header files and converts them to LF-style line endings:

find . -name '*.[ch]' -exec fromdos {} \;
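On systems where fromdos is not installed, a similar recursive conversion can be sketched with find and tr alone. The function name strip_cr_tree and its two arguments (a starting directory and a file name pattern) are hypothetical; the sketch also assumes that mktemp is available. Each matching file is rewritten in place through a temporary copy, so its permissions are left as they were.

```shell
# strip_cr_tree DIR PATTERN: hypothetical helper that removes every CR from
# all regular files below DIR whose names match PATTERN.
# Assumes mktemp is available for the temporary copy.
strip_cr_tree() {
    find "$1" -type f -name "$2" -exec sh -c '
        tmp=$(mktemp) || exit 1
        for f do
            # Convert into the temporary file, then copy the result back.
            tr -d "\r" < "$f" > "$tmp" && cat "$tmp" > "$f"
        done
        rm -f "$tmp"' sh {} +
}

# Example call (commented out so that nothing is modified by accident):
# strip_cr_tree . '*.c'
```

The pattern should be quoted, as in the example call, so that the shell passes it to find unexpanded.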

The file command also detects the type of EOL used:

> file myfile.txt
myfile.txt: ASCII text, with CRLF line terminators

Other tools permit the user to visualise the EOL characters:

od -a myfile.txt
cat -e myfile.txt
hexdump -c myfile.txt
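For long files a full dump is hard to scan. As a complementary sketch, the CR and LF bytes can simply be counted with tr and wc; the function name count_eol_bytes and its output format are hypothetical. Equal counts suggest CR+LF endings throughout, a CR count of zero suggests LF endings, and anything else suggests a mixed or CR-only file.

```shell
# count_eol_bytes FILE: hypothetical helper that prints the number of CR and
# LF bytes in FILE. tr -cd deletes everything except the named byte, and the
# arithmetic expansion strips the padding that some wc implementations emit.
count_eol_bytes() {
    cr=$(( $(tr -cd '\r' < "$1" | wc -c) ))
    lf=$(( $(tr -cd '\n' < "$1" | wc -c) ))
    echo "CR=$cr LF=$lf"
}

# Demo on a temporary sample with one CR+LF line and one LF line.
sample=$(mktemp)
printf 'a\r\nb\n' > "$sample"
count_eol_bytes "$sample"     # prints: CR=1 LF=2
rm -f "$sample"
```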

The dos2unix, unix2dos, mac2unix, unix2mac, mac2dos and dos2mac utilities can perform conversions. The flip command is also often used.
