The Forum for Discussion about The Third Manifesto and Related Matters


Why is Cassandra not considered a relational database?

This is the Q that just got deleted from StackOverflow. It seems reasonable enough, so I'll put it here verbatim, although there's clearly some confusion in it. The Q should of course be '... not considered a relational database management system?'

Read this answer

Relational databases are based on the relational model, an intuitive, straightforward way of representing data in tables. In a relational database, each row in the table is a record with a unique ID called the key. The columns of the table hold attributes of the data, and each record usually has a value for each attribute, making it easy to establish the relationships among data points.

Cassandra has tables.

CREATE TABLE movies (
    movie_id UUID,
    title TEXT,
    release_year INT,
    PRIMARY KEY (( movie_id ))
);

Data being relational has nothing to do with support of ACID properties.

Data being relational has nothing to do with normalised data


  1. Why Cassandra is not considered Relational database?
  2. Why Cassandra is considered NoSQL database? despite it has tables

1. Why Cassandra is not considered Relational database?

Because it's sold as a NoSQL management system. Because its data organisation is a key-value store (see the second paragraph at that link).

2. Why Cassandra is considered NoSQL database? despite it has tables

Having tables is not a defining characteristic of being relational. It's not even a defining characteristic of being a database: there are 'times tables', logarithm tables, ... Tables (that is, data in columns and rows) long pre-dated computers; manual filing systems and then computer filing systems adopted/adapted them. Before the Relational model, 'Indexed-sequential' file systems and Hierarchical database management systems organised data into tables.

From Wikipedia on Cassandra:

Unlike a table in an RDBMS, different rows in the same column family do not have to share the same set of columns, and a column may be added to one or multiple rows at any time.

So Cassandra has 'column families', which RDBMSs don't. In an RDBMS every row of a table must have the same set of columns; the rows of a Cassandra wotsit needn't.
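To make that difference concrete, here is a toy sketch in Python. It is only a model of the idea in the Wikipedia quote, not Cassandra's API: the dict layout, the sample rows and the helper name is_relation_body are all made up for illustration.

```python
# Toy model of a 'column family': each row key maps to its own set of columns.
# (Illustrative data only; nothing here talks to Cassandra.)
column_family = {
    "row-1": {"title": "Movie A", "release_year": 1999},
    "row-2": {"title": "Movie B", "director": "Someone"},  # different columns
}

def is_relation_body(rows, heading):
    """True when every row has all and only the attributes named in heading."""
    return all(set(row) == set(heading) for row in rows)

heading = {"title", "release_year"}

# The column family fails the test: row-2 lacks release_year and adds director.
print(is_relation_body(column_family.values(), heading))  # False

# A relational table passes: every row has exactly the declared columns.
relation_rows = [
    {"title": "Movie A", "release_year": 1999},
    {"title": "Movie B", "release_year": 2004},
]
print(is_relation_body(relation_rows, heading))  # True
```

The check is deliberately strict ('all and only'), which is the requirement an RDBMS places on the rows of a table.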

Although NoSQL DBMSs are usually described as 'key-value stores', they can still present data as tables, and indeed some support queries in SQL -- probably a limited form of SQL.

Data being relational has nothing to do with support of ACID properties.

Data being relational has nothing to do with normalised data

Neither of those claims is entirely true. Organising data relationally means using keys (potentially several per table) and Foreign Keys, and most critically having the RDBMS enforce uniqueness of keys and integrity of FK references. You need at least a minimal amount of transaction control/data integrity to enforce that, even if it's not up to the level of ACID. (For example, the DBMS must reject a transaction that deletes a row whose key is still referenced.) NoSQL/key-value stores do not necessarily even allow you to declare multiple keys and Foreign Keys, let alone enforce integrity.
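For anyone who wants to see that enforcement in action, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in SQL DBMS. The studio/movie tables and their rows are invented for the illustration. Note that SQLite only enforces Foreign Keys once the foreign_keys pragma is switched on.

```python
import sqlite3

def demo_referential_integrity():
    """Try three updates that violate declared constraints; report each outcome."""
    conn = sqlite3.connect(":memory:")
    # SQLite leaves Foreign Key enforcement off unless asked.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE studio ("
        " studio_id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL UNIQUE)"  # a second key on the same table
    )
    conn.execute(
        "CREATE TABLE movie ("
        " movie_id INTEGER PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " studio_id INTEGER NOT NULL REFERENCES studio (studio_id))"
    )
    conn.execute("INSERT INTO studio VALUES (1, 'Studio A')")
    conn.execute("INSERT INTO movie VALUES (10, 'Movie A', 1)")

    attempts = {
        "duplicate key": "INSERT INTO studio VALUES (1, 'Studio B')",
        "dangling reference": "INSERT INTO movie VALUES (11, 'Movie B', 99)",
        "delete referenced row": "DELETE FROM studio WHERE studio_id = 1",
    }
    outcomes = {}
    for label, statement in attempts.items():
        try:
            conn.execute(statement)
            outcomes[label] = "accepted"
        except sqlite3.IntegrityError:
            # The DBMS refuses the update and the data stays consistent.
            outcomes[label] = "rejected"
    conn.close()
    return outcomes

print(demo_referential_integrity())
```

All three attempts should come back 'rejected': the DBMS, not the application, is what keeps the keys unique and the references valid.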

From Wikipedia on Cassandra:

Tunable consistency

Cassandra is typically classified as an AP system, meaning that availability and partition tolerance are generally considered to be more important than consistency in Cassandra.[17] Writes and reads offer a tunable level of consistency, all the way from "writes never fail" to "block for all replicas to be readable", with the quorum level in the middle.[18]
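The range of levels in that quote can be sketched with a little arithmetic. This is the usual replica-overlap rule for quorum systems, not something taken from the quote itself, and the function names are hypothetical: with N replicas, a read is guaranteed to touch at least one replica holding the latest acknowledged write only when the number of replicas written plus the number read exceeds N.

```python
def quorum(replicas):
    """Smallest majority of the replica set."""
    return replicas // 2 + 1

def read_sees_latest_write(replicas, write_acks, read_acks):
    """True when every read set must overlap every write set in some replica."""
    return write_acks + read_acks > replicas

n = 3  # assumed replication factor, for illustration only

# Lowest setting on both sides: fast, but a read may miss the latest write.
print(read_sees_latest_write(n, 1, 1))                    # False

# Quorum on both sides: the two majorities always share a replica.
print(read_sees_latest_write(n, quorum(n), quorum(n)))    # True

# Write to all replicas, read from any one.
print(read_sees_latest_write(n, n, 1))                    # True
```

Even where the overlap holds, this only says a read can find the latest write; it says nothing about keys or Foreign Keys being enforced.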

In an RDBMS nothing is "more important than consistency". All the data content visible to end-users must at all times be consistent with the declared constraints: keys (uniqueness) and Foreign Keys, and other restrictions (if your RDBMS supports them).

Data in a relational database must be at least normalised enough to declare a key for every table, and that means every row must have a non-null value in every column participating in the key(s). Data in a relational database must be at least normalised enough for every row in a table to have all and only the same columns. That's (apparently) not true for a Cassandra 'column family'. In key-value stores, the key is typically a surrogate for the row (maybe a hash), not a value that the users of the database's content would recognise as data.

This Q also got some answers on StackExchange: https://dba.stackexchange.com/questions/280929/why-cassandra-is-not-considered-a-relational-database