The Forum for Discussion about The Third Manifesto and Related Matters


Cloud Databases? Databases Without Borders!

Industry buzzword? I know.

The classical picture of the world (as opposed to quantum mechanics) describes the entire world as having a single, well-defined state. Therefore, there should be only one database instance, "the cloud" if you will. Yet the term is quite nebulous, and in practice it essentially means a bunch of database instances managed by some utility-like company.

Therefore, there is not much innovation in this cloud business. The cloud still divides information into separate parts (individual database instances), which is a problem for at least 3 reasons:
1. It multiplies complexity, because multiple instances have to be managed.
2. It complicates queries involving data from distinct instances. In practice, direct database access to those instances is never advertised, or even allowed. This sad state of affairs has spawned a cottage industry of "application integration" tools.
3. It is entirely arbitrary, so in many customer environments the instances are separate for no reason.

Cloud databases IMO should amend the SQL standard so that instance division becomes transparent. This is just a natural progression of the old idea that the physical implementation of the database engine is isolated from the logical layer (SQL). After all, a user who queries a list of employees doesn't care where on the file system the Employees table is located. What needs to be done is just to extend this idea and eliminate the second physical attribute: the host name where the database runs.
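
As a rough illustration of what transparent instance division could look like to someone writing a query, here is a minimal Python sketch. It assumes SQLite's ATTACH DATABASE as a stand-in for separate instances, which is only an analogy: the attached databases live in one process, not on different hosts, and nothing here reflects an actual or proposed change to the SQL standard. The schema names `hr` and `payroll`, the table and column names, and the sample rows are all hypothetical.

```python
import sqlite3


def build_connection() -> sqlite3.Connection:
    """Create one connection that spans two separate (hypothetical) databases."""
    conn = sqlite3.connect(":memory:")
    # Each ATTACH creates an independent in-memory database; together they
    # play the role of two "instances" in this sketch.
    conn.execute("ATTACH DATABASE ':memory:' AS hr")
    conn.execute("ATTACH DATABASE ':memory:' AS payroll")
    conn.execute(
        "CREATE TABLE hr.employees (emp_id INTEGER PRIMARY KEY, name TEXT)"
    )
    conn.execute(
        "CREATE TABLE payroll.salaries (emp_id INTEGER PRIMARY KEY, amount INTEGER)"
    )
    conn.executemany(
        "INSERT INTO hr.employees VALUES (?, ?)", [(1, "Ann"), (2, "Bob")]
    )
    conn.executemany(
        "INSERT INTO payroll.salaries VALUES (?, ?)", [(1, 5000), (2, 4200)]
    )
    return conn


def employee_salaries(conn: sqlite3.Connection) -> list:
    """Join across both databases without naming where either table lives."""
    query = (
        "SELECT name, amount "
        "FROM employees JOIN salaries USING (emp_id) "
        "ORDER BY name"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for name, amount in employee_salaries(build_connection()):
        print(name, amount)
```

SQLite resolves an unqualified table name across the attached databases as long as the name is unique, so the query text never mentions which database holds `employees` or `salaries`. Doing the same across hosts is the part that would still need the kind of standardization described above.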

I'm not sure what the appropriate venue to nurture these ideas is. From what I have seen, Cloud Foundry seems to be an organization that focuses on "application as a service", with little appreciation of database technology. Those block technology stack diagrams (http://docs.cloudfoundry.org/concepts/architecture/) were never really influential in our field.

More importantly, what would be the incentives for various users to share their data? It would be nice if the database world had some kind of currency traded for access to the data...

 

I agree.

Indeed, in general "the cloud" -- which could mean transformative and seamless massively-shared computational horsepower -- only means "someone else's data centre" or "the Internet" or "the Web" or "some machine/software/service running in/on someone else's data centre / the Internet / the Web", depending on context.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

First, I think the driving commercial pressures are to satisfy the needs of companies that don't want to manage their own servers, do want to enable secure remote access, and in many cases prefer to run SaaS applications. AWS, Azure, Google, etc. do this quite well. These customers would take a negative view of joining their database with one belonging to someone else.

Second, the design and implementation techniques we have are focussed on the business needs of one customer. They assume a single source of truth for definitions, types and constraints. They don't adapt well to a more global view. The RM, SQL, even NoSQL are just not the right place to start.

Companies seeking to aggregate and query across multiple entities have built data warehouses. They use column stores and time series, and lots of conversion tools, and they have their own query languages. Look there if you want some ideas about how the cloud might be globally searchable.

 

 

Andl - A New Database Language - andl.org
Quote from Tegiri Nenashi on June 26, 2019, 9:51 pm

Industry buzzword? I know.

The classical picture of the world (as opposed to quantum mechanics) describes the entire world as having a single, well-defined state. Therefore, there should be only one database instance, "the cloud" if you will. Yet the term is quite nebulous, and in practice it essentially means a bunch of database instances managed by some utility-like company.

Therefore, there is not much innovation in this cloud business. The cloud still divides information into separate parts (individual database instances), which is a problem for at least 3 reasons:
1. It multiplies complexity, because multiple instances have to be managed.
2. It complicates queries involving data from distinct instances. In practice, direct database access to those instances is never advertised, or even allowed. This sad state of affairs has spawned a cottage industry of "application integration" tools.
3. It is entirely arbitrary, so in many customer environments the instances are separate for no reason.

Cloud databases IMO should amend the SQL standard so that instance division becomes transparent. This is just a natural progression of the old idea that the physical implementation of the database engine is isolated from the logical layer (SQL). After all, a user who queries a list of employees doesn't care where on the file system the Employees table is located. What needs to be done is just to extend this idea and eliminate the second physical attribute: the host name where the database runs.

I'm not sure what the appropriate venue to nurture these ideas is. From what I have seen, Cloud Foundry seems to be an organization that focuses on "application as a service", with little appreciation of database technology. Those block technology stack diagrams (http://docs.cloudfoundry.org/concepts/architecture/) were never really influential in our field.

More importantly, what would be the incentives for various users to share their data? It would be nice if the database world had some kind of currency traded for access to the data...

 

Physical nearness of the data queried still impacts response speed significantly.  So users still ***do*** care "where on file system" data is located.

And in many customer environments the db instances are separate for ***rather poor*** reasons (e.g. the same reasons why the data was silo-ed in the early days of computing according to which application was using them), but that's not the same thing as "no reason".

I don't think "incentives for users to share their data" is the problem.  I think a bigger problem is that authoritative sources sharing their data often will have a legal obligation to verify that the use of their data is justified and proportionate, and another one might be to reach agreement on who exactly the authoritative source is for datum XYZ.

Quote from Erwin on June 27, 2019, 10:13 am

Physical nearness of the data queried still impacts response speed significantly.  So users still ***do*** care "where on file system" data is located.

And in many customer environments the db instances are separate for ***rather poor*** reasons (e.g. the same reasons why the data was silo-ed in the early days of computing according to which application was using them), but that's not the same thing as "no reason".

I don't think "incentives for users to share their data" is the problem.  I think a bigger problem is that authoritative sources sharing their data often will have a legal obligation to verify that the use of their data is justified and proportionate, and another one might be to reach agreement on who exactly the authoritative source is for datum XYZ.

Indeed. Regulations like the EU's GDPR (General Data Protection Regulation) have certainly shone a spotlight on where and how data is stored, how it's collected, and how it's used. This may have stifled some development in cloud databases.
