Content-addressable Storage - Content-addressed Vs. Location-addressed

When contrasted with content-addressed storage, a typical local or networked storage device is referred to as location-addressed. In a location-addressed storage device, each element of data is stored on the physical medium and its location is recorded for later use. The storage device often keeps a list, or directory, of these locations. A later request for a particular item includes only the location of the data (for example, a path and file name). The storage device uses this information to locate the data on the physical medium and retrieve it. When new information is written to a location-addressed device, it is simply stored in some available free space, without regard to its content. The information at a given location can usually be altered or completely overwritten without any special action on the part of the storage device.

Within the scope of this discussion, location-addressed storage is best thought of as container-addressed storage: an address names a container, not the data held in it.
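
The behavior described above can be illustrated with a minimal sketch. It assumes an in-memory dictionary standing in for the device's directory; the class name `LocationAddressedStore` and the example path are hypothetical and are not taken from any real storage product.

```python
class LocationAddressedStore:
    """Toy location-addressed store: data is found by where it lives."""

    def __init__(self):
        # The "directory": maps a location (here, a path string) to stored bytes.
        self._directory = {}

    def write(self, path: str, data: bytes) -> None:
        # The content plays no role in placement; anything already stored
        # at this path is overwritten without any special action.
        self._directory[path] = data

    def read(self, path: str) -> bytes:
        # The request carries only the location.
        return self._directory[path]


store = LocationAddressedStore()
store.write("/reports/q1.txt", b"draft")
store.write("/reports/q1.txt", b"final")  # same location, different content
assert store.read("/reports/q1.txt") == b"final"
```

The same path returns different data before and after the second write, which is the sense in which the address identifies a container and not its contents.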

The Content Addressable File Store (CAFS) was a hardware device developed and sold by International Computers Limited (ICL) in the 1970s and 1980s that provided location-addressed disk storage with built-in search capability. The search logic was incorporated into the disk controller. A query expressed in a high-level query language could be compiled into a search specification that was then sent to the disk controller for execution. Files could also be accessed via the conventional location-addressing mechanism, permitting CAFS to support an IDMS CODASYL database and also support content addressing of the same records.

In contrast, when information is stored in a content-addressed storage (CAS) system, the system records a content address: an identifier uniquely and permanently linked to the information content itself. A request to retrieve information from a CAS system must provide this content identifier, from which the system determines the physical location of the data and retrieves it. Because identifiers are derived from the content, any change to a data element necessarily changes its content address. In nearly all cases, a CAS device does not permit information to be edited once it has been stored. Whether it can be deleted is often controlled by a policy.
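
A minimal sketch of this idea follows. The text does not say how a content address is computed; the sketch assumes a SHA-256 digest of the data, which is one common choice. The class name `ContentAddressedStore` is hypothetical, and deletion policies are not modeled.

```python
import hashlib


class ContentAddressedStore:
    """Toy content-addressed store: data is found by what it contains."""

    def __init__(self):
        # Maps a content address to the stored bytes.
        self._objects = {}

    def put(self, data: bytes) -> str:
        # Assumption: the content address is the SHA-256 hex digest of the data.
        address = hashlib.sha256(data).hexdigest()
        # Identical content always maps to the same address, so storing it
        # again changes nothing; there is no way to edit an entry in place.
        self._objects.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        # The request carries only the content identifier.
        return self._objects[address]


cas = ContentAddressedStore()
addr_draft = cas.put(b"draft")
addr_final = cas.put(b"final")
assert addr_draft != addr_final            # changed content, changed address
assert cas.put(b"draft") == addr_draft     # same content, same address
assert cas.get(addr_draft) == b"draft"     # the earlier version is still intact
```

Because a modified item receives a new address, the earlier version remains retrievable under its original identifier until a policy, if any, removes it.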

While the idea of content-addressed storage is not new, production-quality systems were not readily available until roughly 2003. In mid-2004, the Storage Networking Industry Association (SNIA), an industry group, began working with a number of CAS providers to create standard behavior and interoperability guidelines for CAS systems.
