Comma-separated Values - Basic Rules and Examples

Many informal documents exist that describe "CSV" formats. IETF RFC 4180 (summarized above) defines the format for the "text/csv" MIME type registered with the IANA (Shafranovich 2005). Another relevant specification is provided by Fielded Text. Creativyst (2010) provides an overview of the variations found in the most widely used applications and explains how CSV can best be used and supported.

Rules typical of these and other "CSV" specifications and implementations are as follows:

  • CSV is a delimited data format that has fields/columns separated by the comma character and records/rows terminated by newlines.
  • A CSV file does not require a specific character encoding, byte order, or line terminator format (some software does not support all line-end variations).
  • A record ends at a line terminator. However, line terminators can be embedded as data within fields, so software must recognize quoted line terminators (see below) in order to correctly assemble an entire record that may span multiple lines.
  • All records should have the same number of fields, in the same order.
  • Data within fields is interpreted as a sequence of characters, not as a sequence of bits or bytes (see RFC 2046, section 4.1). For example, the numeric quantity 65535 may be represented as the 5 ASCII characters "65535" (or perhaps other forms such as "0xFFFF", "000065535.000E+00", etc.), but not as a sequence of 2 bytes intended to be treated as a single binary integer rather than as two characters. If this "plain text" convention is not followed, the CSV file no longer contains sufficient information to be interpreted correctly, is unlikely to survive transmission across differing computer architectures, and does not conform to the text/csv MIME type.
  • Adjacent fields must be separated by a single comma. However, "CSV" formats vary greatly in this choice of separator character. In particular, in locales where the comma is used as a decimal separator, a semicolon, tab, or other character is used instead.
1997,Ford,E350
  • Any field may be quoted (that is, enclosed within double-quote characters). Some fields must be quoted, as specified in following rules.
"1997","Ford","E350"
  • Fields with embedded commas must be quoted.
1997,Ford,E350,"Super, luxurious truck"
  • Fields with embedded double-quote characters must be quoted, and each embedded double-quote character must be represented by a pair of double-quote characters.
1997,Ford,E350,"Super, ""luxurious"" truck"
  • Fields with embedded line breaks must be quoted (however, many CSV implementations simply do not support this).
1997,Ford,E350,"Go get one now
they are going fast"
  • In some CSV implementations, leading and trailing spaces and tabs are trimmed. This practice is controversial, and does not accord with RFC 4180, which states "Spaces are considered part of a field and should not be ignored."
1997, Ford, E350
not same as
1997,Ford,E350
  • In CSV implementations that do trim leading or trailing spaces, fields with such spaces as meaningful data must be quoted.
1997,Ford,E350," Super luxurious truck "
  • The first record may be a "header", which contains column names in each of the fields (there is no reliable way to tell whether a file does this or not; however, it is uncommon to use characters other than letters, digits, and underscores in such column names).
Year,Make,Model
1997,Ford,E350
2000,Mercury,Cougar
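
The quoting rules above can be checked against a real parser. The following is a minimal sketch using Python's standard csv module, whose default dialect treats doubled double-quotes as an escaped quote and preserves spaces around fields. The names SAMPLE, parse_csv, and to_csv, and the extra "Description" column name, are hypothetical and chosen only for illustration; other CSV libraries may behave differently, particularly for embedded line breaks and whitespace.

```python
import csv
import io

# Sample records built from the examples above, with a header row.
# The second data record contains a line break inside a quoted field.
SAMPLE = (
    'Year,Make,Model,Description\n'
    '1997,Ford,E350,"Super, ""luxurious"" truck"\n'
    '1997,Ford,E350,"Go get one now\nthey are going fast"\n'
    '1997,Ford,E350," Super luxurious truck "\n'
)


def parse_csv(text, delimiter=","):
    """Parse CSV text into a list of records (each a list of strings)."""
    # newline="" leaves line endings untouched so the csv module can
    # decide which ones terminate a record and which ones are data.
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))


def to_csv(rows, delimiter=","):
    """Serialize records, quoting only the fields that require it."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


rows = parse_csv(SAMPLE)
header, records = rows[0], rows[1:]

# Four physical data lines in SAMPLE collapse into three records,
# because the quoted line break belongs to a field.
print(len(records))
print(repr(records[0][3]))  # embedded comma and doubled quotes
print(repr(records[1][3]))  # embedded line break
print(repr(records[2][3]))  # leading and trailing spaces kept

# Unquoted spaces are also kept as data by this parser's default dialect.
print(parse_csv("1997, Ford, E350\n"))

# A semicolon-separated variant, as used where the comma is a decimal mark.
print(parse_csv("1997;Ford;3,5\n", delimiter=";"))
```

Writing goes the other way: to_csv([["1997", "Ford", "E350", 'Super, "luxurious" truck']]) quotes only the last field and doubles its embedded quotes. Note that the parser cannot tell on its own whether the first record is a header; the sketch simply assumes that it is.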
