Shards Compared To Horizontal Partitioning
Horizontal partitioning splits one or more tables by row, usually within a single instance of a schema and a database server. It may offer an advantage by reducing index size (and thus search effort) provided that there is some obvious, robust, implicit way to identify in which table a particular row will be found, without first needing to search the index, e.g. the classic example of the 'CustomersEast
' and 'CustomersWest
' tables, where their zip code already indicates where they will be found.
Sharding goes beyond this: it partitions the problematic table(s) in the same way, but it does this across potentially multiple instances of the schema. The obvious advantage would be that search load for the large partitioned table can now be split across multiple servers (logical or physical), not just multiple indexes on the same logical server.
Splitting shards across multiple isolated instances requires more than simple horizontal partitioning. The hoped-for gains in efficiency would be lost, if querying the database required both instances to be queried, just to retrieve a simple dimension table. Beyond partitioning, sharding thus splits large partitionable tables across the servers, while smaller tables are replicated as complete units.
This is also why sharding is related to a shared nothing architecture - once sharded, each shard can live in a totally separate logical schema instance / physical database server / data center / continent. There is no ongoing need to retain shared access (from between shards) to the other unpartitioned tables in other shards.
This makes replication across multiple servers easy (simple horizontal partitioning can't). It is also useful for worldwide distribution of applications, where communications links between data centers would otherwise be a bottleneck.
There is also a requirement for some notification and replication mechanism between schema instances, so that the unpartitioned tables remain as closely synchronized as the application demands. This is a complex choice in the architecture of sharded systems: approaches range from making these effectively read-only (updates are rare and batched), to dynamically replicated tables (at the cost of reducing some of the distribution benefits of sharding) and many options in between.
Read more about this topic: Shard (database Architecture)
Famous quotes containing the words shards, compared and/or horizontal:
“False history gets made all day, any day,
the truth of the new is never on the news
False history gets written every day
...
the lesbian archaeologist watches herself
sifting her own life out from the shards shes piecing,
asking the clay all questions but her own.”
—Adrienne Rich (b. 1929)
“Even with limited intelligence, knowing oneself is not as difficult as some say, but to act according to what one has realized about oneself in real life is as difficult as practicing anything else, compared to theory.”
—Franz Grillparzer (17911872)
“In bourgeois society, the French and the industrial revolution transformed the authorization of political space. The political revolution put an end to the formalized hierarchy of the ancien regimé.... Concurrently, the industrial revolution subverted the social hierarchy upon which the old political space was based. It transformed the experience of society from one of vertical hierarchy to one of horizontal class stratification.”
—Donald M. Lowe, U.S. historian, educator. History of Bourgeois Perception, ch. 4, University of Chicago Press (1982)