Up to Cassandra 1.0, Cassandra was not row-level consistent, meaning that inserts and updates into a table that affect the same row and are processed at approximately the same time may affect the non-key columns in inconsistent ways. One update may affect one column while another affects the other, resulting in sets of values within the row that were never specified or intended. Cassandra 1.1 solved this issue by introducing row-level isolation.
Deletion markers called "tombstones" are known to cause performance degradation, which can be severe.
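The effect of missing row-level isolation can be illustrated with a small, deterministic toy model. This is only a sketch of the behaviour described above, not Cassandra's implementation; the function `run_schedule` and the column names are hypothetical and chosen for illustration.

```python
def run_schedule(schedule):
    """Apply (column, value) writes in order; the last write to a column wins."""
    row = {}
    for column, value in schedule:
        row[column] = value
    return row


# Two updates to the same row, each setting both non-key columns.
update_a = [("first_name", "Alice"), ("last_name", "Smith")]
update_b = [("first_name", "Bob"), ("last_name", "Jones")]

# Without isolation, the column writes of the two updates may interleave.
interleaved = [update_a[0], update_b[0], update_b[1], update_a[1]]
mixed = run_schedule(interleaved)

# With row-level isolation, each update is applied as one unit.
isolated = run_schedule(update_a + update_b)

print(mixed)     # a combination that neither update asked for
print(isolated)  # the row as written by the later update
```

In the interleaved schedule the row ends up with the first name from one update and the last name from the other, a combination that neither writer intended.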
Cassandra is a wide column store and, as such, essentially a hybrid between a key-value and a tabular database management system. Its data model is a partitioned row store with tunable consistency. Rows are organized into tables; the first component of a table's primary key is the partition key; within a partition, rows are clustered by the remaining columns of the key. Other columns may be indexed separately from the primary key.
Tables may be created, dropped, and altered at runtime without blocking updates and queries.
Cassandra cannot do joins or subqueries. Rather, Cassandra emphasizes denormalization through features like collections.
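The split of the primary key into a partition key and clustering columns can be sketched as a nested map. The following is a minimal in-memory illustration under stated assumptions, not how Cassandra stores data; the class `ToyTable`, its methods, and the sensor example are hypothetical.

```python
from collections import defaultdict


class ToyTable:
    """Toy partitioned row store: partition key -> clustering key -> columns."""

    def __init__(self):
        self._partitions = defaultdict(dict)

    def upsert(self, partition_key, clustering_key, **columns):
        # Rows with the same full primary key are merged column by column.
        row = self._partitions[partition_key].setdefault(clustering_key, {})
        row.update(columns)

    def read_partition(self, partition_key):
        # Within a partition, rows are returned ordered by clustering key.
        rows = self._partitions.get(partition_key, {})
        return [(ck, rows[ck]) for ck in sorted(rows)]


table = ToyTable()
table.upsert("sensor-1", "2024-01-02", temperature=21.5)
table.upsert("sensor-1", "2024-01-01", temperature=20.0)
table.upsert("sensor-2", "2024-01-01", temperature=18.0)
table.upsert("sensor-1", "2024-01-01", humidity=40)

for clustering_key, columns in table.read_partition("sensor-1"):
    print(clustering_key, columns)
```

Here the sensor identifier plays the role of the partition key and the date string plays the role of a single clustering column, so all readings for one sensor sit together and come back in date order.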
A column family (called "table" since CQL 3) resembles a table in an RDBMS (relational database management system). Column families contain rows and columns. Each row is uniquely identified by a row key. Each row has multiple columns, each of which has a name, a value, and a timestamp. Unlike a table in an RDBMS, different rows in the same column family do not have to share the same set of columns, and a column may be added to one or multiple rows at any time.
Each key in Cassandra corresponds to a value which is an object. Each key has values as columns, and columns are grouped together into sets called column families. Thus, each key identifies a row of a variable number of elements. These column families could then be considered as tables. A table in Cassandra is a distributed multi-dimensional map indexed by a key. Furthermore, applications can specify the sort order of columns within a super column or simple column family.
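A column family with sparse rows can be modelled as a map from row key to a map of named columns. The sketch below is illustrative only; `Column` and `ToyColumnFamily` are hypothetical names, and the rule that a write with a newer timestamp replaces an older one is an assumption of this toy model, mirroring last-write-wins conflict resolution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    value: object
    timestamp: int


class ToyColumnFamily:
    """Toy column family: row key -> {column name -> Column}."""

    def __init__(self):
        self.rows = {}

    def write(self, row_key, name, value, timestamp):
        row = self.rows.setdefault(row_key, {})
        current = row.get(name)
        # Assumption of this sketch: the write with the newer timestamp wins.
        if current is None or timestamp >= current.timestamp:
            row[name] = Column(name, value, timestamp)

    def column_names(self, row_key):
        return sorted(self.rows.get(row_key, {}))


users = ToyColumnFamily()
users.write("alice", "email", "alice@example.com", timestamp=1)
users.write("alice", "phone", "555-0100", timestamp=2)
users.write("bob", "email", "bob@example.com", timestamp=3)
users.write("alice", "email", "stale@example.com", timestamp=0)  # older, ignored

print(users.column_names("alice"))  # two columns
print(users.column_names("bob"))    # one column; rows need not match
```

The two rows live in the same column family but carry different sets of columns, and adding the `phone` column to one row required no change to the other.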
Management and Monitoring
Cassandra is a Java-based system that can be managed and monitored via Java Management Extensions (JMX). The JMX-compliant nodetool utility, for instance, can be used to manage a Cassandra cluster (adding nodes to a ring, draining nodes, decommissioning nodes, and so on). Nodetool also offers a number of commands to return Cassandra metrics pertaining to disk usage, latency, compaction, garbage collection, and more.
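As a rough illustration, nodetool can be driven from a script by building its command line and running it as a subprocess. The helper names `nodetool_argv` and `run_nodetool` below are hypothetical, and the host and port defaults (localhost and JMX port 7199) are assumptions about a default local installation; the sketch returns `None` when nodetool is not on the PATH.

```python
import shutil
import subprocess


def nodetool_argv(command, *args, host="127.0.0.1", port=7199):
    """Build the argument list for one nodetool invocation (nothing is run)."""
    return ["nodetool", "-h", host, "-p", str(port), command, *args]


def run_nodetool(command, *args, **connection):
    """Run nodetool if it is installed and return its stdout, else None."""
    if shutil.which("nodetool") is None:
        return None
    result = subprocess.run(
        nodetool_argv(command, *args, **connection),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


# Management commands mentioned above, and a few metric-reporting ones.
for command in ("status", "drain", "decommission", "compactionstats", "gcstats"):
    print(" ".join(nodetool_argv(command)))
```

Commands such as `drain` and `decommission` change the state of a node, so a script like this would normally only print them, as here, unless running them is really intended.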
Since Cassandra 2.0.2 in 2013, measures of several metrics are produced via the Dropwizard metrics framework, and may be queried via JMX using tools such as JConsole or passed to external monitoring systems via Dropwizard-compatible reporter plugins.