The Forum for Discussion about The Third Manifesto and Related Matters


The future of TTM

Quote from Dave Voorhis on July 26, 2021, 8:40 am
Quote from Dave Voorhis on July 26, 2021, 8:29 am
Quote from Darren Duncan on July 26, 2021, 8:14 am
Quote from Dave Voorhis on July 26, 2021, 7:12 am

What I mean is that the world is not yet ready to embrace a MyTTM instead of MySQL, or a TTM Server instead of SQL Server, or a PostgreTTM instead of PostgreSQL, or a TTMLite instead of SQLite, etc.

Can we please say "D" rather than "TTM" in any discussion where we're substituting for "SQL", as in the cases above? To someone who knows what we're talking about, your choice above reads terribly.

 

I... Guess...

D is only one letter. I was going for a recognisable three-letter substitute for maximum relevant equivalence.

Though whilst I'm thinking of it... DLite is a great name, and I'm going to create it just because of that name.

Assuming there aren't a dozen projects/products already using it, of course.

There is no point using TTM. It has no mind share. And D is not much better, on its own.

DLite and DDLite are taken (by products, not just the domain). Best I could come up with is RelDLite.

Andl - A New Database Language - andl.org
Quote from dandl on July 26, 2021, 1:05 pm
Quote from Dave Voorhis on July 26, 2021, 8:40 am
Quote from Dave Voorhis on July 26, 2021, 8:29 am
Quote from Darren Duncan on July 26, 2021, 8:14 am
Quote from Dave Voorhis on July 26, 2021, 7:12 am

What I mean is that the world is not yet ready to embrace a MyTTM instead of MySQL, or a TTM Server instead of SQL Server, or a PostgreTTM instead of PostgreSQL, or a TTMLite instead of SQLite, etc.

Can we please say "D" rather than "TTM" in any discussion where we're substituting for "SQL", as in the cases above? To someone who knows what we're talking about, your choice above reads terribly.

 

I... Guess...

D is only one letter. I was going for a recognisable three-letter substitute for maximum relevant equivalence.

Though whilst I'm thinking of it... DLite is a great name, and I'm going to create it just because of that name.

Assuming there aren't a dozen projects/products already using it, of course.

There is no point using TTM. It has no mind share. And D is not much better, on its own.

The essential ideas are strong. They're what matter, even if their application isn't a D per se, or a SQL substitute.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on July 26, 2021, 12:00 pm
Quote from dandl on July 26, 2021, 10:55 am
Quote from Dave Voorhis on July 26, 2021, 7:12 am
Quote from dandl on July 26, 2021, 1:29 am
Quote from Dave Voorhis on July 23, 2021, 9:19 am
Quote from dandl on July 23, 2021, 4:33 am

I think it would be fair to say that the original goals of TTM, however expressed, are unlikely to be realised. SQL is centre stage, and probably has been since about the mid 1990s. So quo vadis TTM?

For managing and accessing a shared corporate application database, SQL serves well. For business application development accessing a shared database, modern GP languages such as Java and C# are sufficient. Despite the need for an ORM and 'glue' code the combination is good enough, and no-one is looking to replace it. For everything else, it has (or should have) competition.

Features that get me looking for something other than SQL+GP include:

  • embedded or in-process database (no server, not shared)
  • non-business data, e.g. images, audio, time series, geographic, real-time
  • non-database data, e.g. file hierarchies, CSV files, spreadsheets, documents
  • any time setting up and maintaining a server does not seem feasible/justified
  • any time writing SQL plus ORM/ODBC glue code does not seem feasible/justified.

There is competition, including:

  • SQLite: in-process, but still SQL+ORM
  • NoSQL: but you don't get relational queries
  • LINQ/Java Streams: but you don't get update
  • bare metal raw files plus various libraries.

So what does TTM offer? Stripping out the new language (which no-one seems to want), the main features seem to be:

  • A type system that can be applied to attributes
  • An in-language DQL based on the RA
  • Updates based on relational assignment
  • Database features including transactions and constraints.
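As a rough illustration of that feature list, here is a minimal sketch in Python rather than in a D. It covers typed attributes, two RA operators, relational assignment and a constraint; transactions are left out. The `Relvar` class, its `assign` method and the `restrict`/`project` helpers are hypothetical names invented for this example, not part of TTM or any library.

```python
from typing import Callable, FrozenSet, NamedTuple


class Part(NamedTuple):
    """Tuple type: each attribute has a declared type."""
    pno: str
    weight: int


Relation = FrozenSet[Part]


class Relvar:
    """Hypothetical relvar: holds one relation value plus a constraint."""

    def __init__(self, constraint: Callable[[Relation], bool]):
        self._constraint = constraint
        self.value: Relation = frozenset()

    def assign(self, new_value: Relation) -> None:
        # Relational assignment: replace the whole value, or change nothing.
        for t in new_value:
            if not (isinstance(t.pno, str) and isinstance(t.weight, int)):
                raise TypeError(f"bad attribute type in {t}")
        if not self._constraint(new_value):
            raise ValueError("constraint violated; relvar unchanged")
        self.value = new_value


def restrict(r, pred):
    """RA restriction (WHERE)."""
    return frozenset(t for t in r if pred(t))


def project(r, *attrs):
    """RA projection; duplicates vanish because the result is a set."""
    return frozenset(tuple(getattr(t, a) for a in attrs) for t in r)


# Constraint: pno is a key (no two tuples share a pno).
parts = Relvar(lambda r: len({t.pno for t in r}) == len(r))
parts.assign(frozenset({Part("P1", 12), Part("P2", 17)}))

# INSERT expressed as assignment of a union.
parts.assign(parts.value | {Part("P3", 17)})

heavy = restrict(parts.value, lambda t: t.weight > 12)
weights = project(parts.value, "weight")
```

An assignment that would break the key, such as adding a second `P1`, raises and leaves the relvar as it was.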

So the addressable market is everyone using data that is not a good match for SQL+GP (because of any of the features listed above), especially those who are already using SQLite or NoSQL or raw files, and who would benefit from any of the TTM features. That's huge.

I agree. There are undoubtedly a vast number of tools that can be written -- in and/or for any number of different languages, platforms, and environments -- which embody one or more of the TTM & relational model & relational algebra ideas. SQL itself isn't yet ready to be replaced -- and promoting products solely on being faithful to the relational model doesn't work as nobody but us cares and the general assumption is that relational == SQL -- but languages, platforms, tools, utilities, you-name-it that do something good (i.e., that save/earn money and/or save effort and/or improve quality) and are based on TTM ideas even if not explicitly mentioned as such in the brochure?

Absolutely.

SQL is absolutely ready to be replaced and is being replaced for many of the above use cases. The stand-out is NoSQL, as the name implies. Client-side LINQ and Java Streams are being used to replace server-side SQL. Countless developers solving problems with the features listed above are cussing SQL and wishing for something better every day.

Yes, people don't care about 'faithful', but they certainly care about solving their problems. What user/developer problem does TTM solve? The RM/RA are well known (ask Google) and SQL is a pain. For some X the pitch is:

  • solves problem X better than SQL
  • does absolutely everything you can do with X in SQL but better/faster/easier/cheaper
  • based on the RA/RM.

Now you have something to sell (for some X).

I'm going to delete every mention of a shared database as being off-topic.

What I mean is that the world is not yet ready to embrace a MyTTM instead of MySQL, or a TTM Server instead of SQL Server, or a PostgreTTM instead of PostgreSQL, or a TTMLite instead of SQLite, etc.

I think it is, if the specific (in-process) benefits are there: quicker, faster, easier, no SQL.

Where NoSQL succeeds -- and isn't just a junior developer's choice over SQL because he's trying to avoid the stuff he found hard in college -- is where the relational model really doesn't apply, at least as we know it. Requirements for per-record dynamicity or document indexing -- where MongoDB or Cassandra or whatever are sound choices -- don't really suit TTM approaches or vice versa.

No, the big reason NoSQL gets over-used is you get to write to local in-process storage in a single language, without having to fuss with servers or SQL. You find out how bad the query language is later.

But when it comes to orchestrating containers or cloud services, or supporting machine learning, or providing custom data analytics, or supporting specialist operations in numerous vertical market niches, there is ample opportunity to produce new tools based on TTM principles and the relational model. They might not look anything like conventional DBMSs, but that's fine -- SQL can have them (and vice versa), because there are plenty of non-conventional possibly-non-database things that can benefit from TTM approaches.

As for LINQ and Streams being used to replace server-side SQL, that's often a junior developer's mistake. Indeed, one of the things I've been paid to do is optimise systems where developers have tried to retrieve multiple large tables from the database repeatedly (SELECT * FROM table;), and then (badly) manipulate them on the client side with Streams/LINQ/record-by-record, when they should be sending one query to the DBMS server side and letting it do the heavy lifting. Then LINQ/Streams become nice -- and appropriate -- tools to helpfully use the query result.
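The difference between the two approaches can be sketched as follows. This is only an illustration: an in-memory SQLite database stands in for the DBMS server, and the `customer` and `orders` tables are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 5.0), (11, 1, 7.5), (12, 2, 3.0);
""")

# Anti-pattern: pull whole tables, then join and aggregate record by record.
customers = conn.execute("SELECT * FROM customer").fetchall()
orders = conn.execute("SELECT * FROM orders").fetchall()
client_side = {}
for cust_id, name in customers:
    client_side[name] = sum(total for _, cid, total in orders if cid == cust_id)

# Preferred: one query; the DBMS does the join and the aggregation.
server_side = dict(conn.execute("""
    SELECT c.name, SUM(o.total)
    FROM customer c JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name
"""))
```

Both produce the same totals here; with large tables the first version transfers every row to the client before doing any work.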

No, I use LINQ a lot and I almost never touch a shared database. LINQ gives you modest in-process SQL-like capabilities on in-process data, CSV files, XLS files, JSON, etc with no servers and no SQL.

The message is: for shared database stick to SQL; for key-value storage try NoSQL and good luck; but for everything else there is Relate-A-Lot (or whatever catchy name you can come up with).

No, for everything else there is LINQ and Streams.

You don't need Relate-A-Lot to replace LINQ and Streams because there is LINQ and Streams.

That's just plain silly. The contest is for an in-process database with SQL-like features: durable multi-table storage, a strong query language and transactions as the baseline. LINQ/Streams are not even in the race.

But that doesn't mean TTM ideas don't have a role. They do. For orchestrating containers or cloud services, supporting machine learning, providing custom data analytics, or supporting specialist operations in numerous vertical market niches. (I have a little, tiny bit of insider knowledge here... This is a big set of markets, and if TTM fans like us don't step up to the plate, we're going to get beaten to it by the Datalog fans.)

Feel free to explain further. I have no idea how 'orchestration' and 'analytics' fit in there.

As a replacement for anything any SQL does in a dominant way -- whether shared or single user -- or for LINQ and Streams in general?

No, not really.

Or only in a small way, and you'll kick yourself later for missing out on opportunities in containers or cloud services, supporting machine learning, providing custom data analytics, supporting specialist operations in numerous vertical market niches, and almost certainly dozens, maybe hundreds, of other places I haven't mentioned.

Please name and describe one specific user problem that this might solve, to help us understand why.

Andl - A New Database Language - andl.org
Quote from dandl on July 27, 2021, 12:42 am
Quote from Dave Voorhis on July 26, 2021, 12:00 pm
Quote from dandl on July 26, 2021, 10:55 am
Quote from Dave Voorhis on July 26, 2021, 7:12 am
Quote from dandl on July 26, 2021, 1:29 am
Quote from Dave Voorhis on July 23, 2021, 9:19 am
Quote from dandl on July 23, 2021, 4:33 am

I think it would be fair to say that the original goals of TTM, however expressed, are unlikely to be realised. SQL is centre stage, and probably has been since about the mid 1990s. So quo vadis TTM?

For managing and accessing a shared corporate application database, SQL serves well. For business application development accessing a shared database, modern GP languages such as Java and C# are sufficient. Despite the need for an ORM and 'glue' code the combination is good enough, and no-one is looking to replace it. For everything else, it has (or should have) competition.

Features that get me looking for something other than SQL+GP include:

  • embedded or in-process database (no server, not shared)
  • non-business data, e.g. images, audio, time series, geographic, real-time
  • non-database data, e.g. file hierarchies, CSV files, spreadsheets, documents
  • any time setting up and maintaining a server does not seem feasible/justified
  • any time writing SQL plus ORM/ODBC glue code does not seem feasible/justified.

There is competition, including:

  • SQLite: in-process, but still SQL+ORM
  • NoSQL: but you don't get relational queries
  • LINQ/Java Streams: but you don't get update
  • bare metal raw files plus various libraries.

So what does TTM offer? Stripping out the new language (which no-one seems to want), the main features seem to be:

  • A type system that can be applied to attributes
  • An in-language DQL based on the RA
  • Updates based on relational assignment
  • Database features including transactions and constraints.

So the addressable market is everyone using data that is not a good match for SQL+GP (because of any of the features listed above), especially those who are already using SQLite or NoSQL or raw files, and who would benefit from any of the TTM features. That's huge.

I agree. There are undoubtedly a vast number of tools that can be written -- in and/or for any number of different languages, platforms, and environments -- which embody one or more of the TTM & relational model & relational algebra ideas. SQL itself isn't yet ready to be replaced -- and promoting products solely on being faithful to the relational model doesn't work as nobody but us cares and the general assumption is that relational == SQL -- but languages, platforms, tools, utilities, you-name-it that do something good (i.e., that save/earn money and/or save effort and/or improve quality) and are based on TTM ideas even if not explicitly mentioned as such in the brochure?

Absolutely.

SQL is absolutely ready to be replaced and is being replaced for many of the above use cases. The stand-out is NoSQL, as the name implies. Client-side LINQ and Java Streams are being used to replace server-side SQL. Countless developers solving problems with the features listed above are cussing SQL and wishing for something better every day.

Yes, people don't care about 'faithful', but they certainly care about solving their problems. What user/developer problem does TTM solve? The RM/RA are well known (ask Google) and SQL is a pain. For some X the pitch is:

  • solves problem X better than SQL
  • does absolutely everything you can do with X in SQL but better/faster/easier/cheaper
  • based on the RA/RM.

Now you have something to sell (for some X).

I'm going to delete every mention of a shared database as being off-topic.

What I mean is that the world is not yet ready to embrace a MyTTM instead of MySQL, or a TTM Server instead of SQL Server, or a PostgreTTM instead of PostgreSQL, or a TTMLite instead of SQLite, etc.

I think it is, if the specific (in-process) benefits are there: quicker, faster, easier, no SQL.

Where NoSQL succeeds -- and isn't just a junior developer's choice over SQL because he's trying to avoid the stuff he found hard in college -- is where the relational model really doesn't apply, at least as we know it. Requirements for per-record dynamicity or document indexing -- where MongoDB or Cassandra or whatever are sound choices -- don't really suit TTM approaches or vice versa.

No, the big reason NoSQL gets over-used is you get to write to local in-process storage in a single language, without having to fuss with servers or SQL. You find out how bad the query language is later.

But when it comes to orchestrating containers or cloud services, or supporting machine learning, or providing custom data analytics, or supporting specialist operations in numerous vertical market niches, there is ample opportunity to produce new tools based on TTM principles and the relational model. They might not look anything like conventional DBMSs, but that's fine -- SQL can have them (and vice versa), because there are plenty of non-conventional possibly-non-database things that can benefit from TTM approaches.

As for LINQ and Streams being used to replace server-side SQL, that's often a junior developer's mistake. Indeed, one of the things I've been paid to do is optimise systems where developers have tried to retrieve multiple large tables from the database repeatedly (SELECT * FROM table;), and then (badly) manipulate them on the client side with Streams/LINQ/record-by-record, when they should be sending one query to the DBMS server side and letting it do the heavy lifting. Then LINQ/Streams become nice -- and appropriate -- tools to helpfully use the query result.

No, I use LINQ a lot and I almost never touch a shared database. LINQ gives you modest in-process SQL-like capabilities on in-process data, CSV files, XLS files, JSON, etc with no servers and no SQL.

The message is: for shared database stick to SQL; for key-value storage try NoSQL and good luck; but for everything else there is Relate-A-Lot (or whatever catchy name you can come up with).

No, for everything else there is LINQ and Streams.

You don't need Relate-A-Lot to replace LINQ and Streams because there is LINQ and Streams.

That's just plain silly. The contest is for an in-process database with SQL-like features: durable multi-table storage, a strong query language and transactions as the baseline. LINQ/Streams are not even in the race.

I think SQLite won that race.

But for in-memory data manipulation, the container/collection filtering, mapping, folding for which you might have used a TTM-inspired relational algebra before LINQ and Streams existed, there is now LINQ and Streams. They won that race.

But that doesn't mean TTM ideas don't have a role. They do. For orchestrating containers or cloud services, supporting machine learning, providing custom data analytics, or supporting specialist operations in numerous vertical market niches. (I have a little, tiny bit of insider knowledge here... This is a big set of markets, and if TTM fans like us don't step up to the plate, we're going to get beaten to it by the Datalog fans.)

Feel free to explain further. I have no idea how 'orchestration' and 'analytics' fit in there.

Orchestration: Cloud service types are represented as relvars; cloud images, instances and services are modelled via tuples. Manipulate swarms of resources using the relational model.

Analytics: Perform ETL with relational tools -- all manner of different kinds of data sources are visible as relvars. Query them and integrate them using a relational algebra; emit them in standard formats for consumption by analytics tools.

Analytics: What if something like R or Julia, or a Python library, was a modern D?

Analytics: What would something like SAS or SPSS/X look like as a modern D?

Analytics: Find a vertical market that does some standard, industry-specific processing. Implement it according to TTM principles.

Orchestration analytics: Cloud service metrics are available as relvars. Manipulate and present cloud infrastructure stats using the relational model.
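The ETL idea above might look something like this in miniature. It is a sketch under stated assumptions: the two inline sources, the attribute names and the `relation`/`natural_join` helpers are all hypothetical, and a real tool would read from files or services rather than strings.

```python
import csv
import io
import json

# Two hypothetical sources in different formats.
CSV_SOURCE = "sensor_id,site\ns1,north\ns2,south\n"
JSON_SOURCE = (
    '[{"sensor_id": "s1", "reading": 20.5},'
    ' {"sensor_id": "s2", "reading": 18.0},'
    ' {"sensor_id": "s1", "reading": 21.0}]'
)


def relation(rows):
    """Represent a relation as a set of tuples of sorted (attribute, value) pairs."""
    return {tuple(sorted(row.items())) for row in rows}


def natural_join(r, s):
    """Join tuples that agree on every attribute name they share."""
    out = set()
    for a in r:
        for b in s:
            da, db = dict(a), dict(b)
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                out.add(tuple(sorted({**da, **db}.items())))
    return out


# Each source becomes visible as a relation, whatever its original format.
sensors = relation(csv.DictReader(io.StringIO(CSV_SOURCE)))
readings = relation(json.loads(JSON_SOURCE))
joined = natural_join(sensors, readings)

# Emit in a standard format (CSV) for a downstream analytics tool.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["sensor_id", "site", "reading"],
                        lineterminator="\n")
writer.writeheader()
for t in sorted(joined):
    writer.writerow(dict(t))
result = out.getvalue()
```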

As a replacement for anything any SQL does in a dominant way -- whether shared or single user -- or for LINQ and Streams in general?

No, not really.

Or only in a small way, and you'll kick yourself later for missing out on opportunities in containers or cloud services, supporting machine learning, providing custom data analytics, supporting specialist operations in numerous vertical market niches, and almost certainly dozens, maybe hundreds, of other places I haven't mentioned.

Please name and describe one specific user problem that this might solve, to help us understand why.

See my orchestration and analytics examples above.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

That's just plain silly. The contest is for an in-process database with SQL-like features: durable multi-table storage, a strong query language and transactions as the baseline. LINQ/Streams are not even in the race.

I think SQLite won that race.

It's not a race. It's a whole series of user/developer problems for which SQLite was a better solution than what came before (mostly ISAM I guess). SQLite brings its own set of problems (foreign language API, glue code, ORM, limited SQL types, SQL coding, injection, etc). At a guess 10% of SQLite users would use something else, but there isn't anything else. Plus a whole bunch of NoSQL users who really are dealing with tables but already chose not to use SQLite.
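To make the glue-code and injection items concrete, here is a small sketch using Python's standard `sqlite3` module. The `users` table and the hostile input string are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("ann", "admin"), ("bob", "guest")])

user_input = "nobody' OR '1'='1"  # hostile value from outside the program

# Glue code that builds SQL by string concatenation: the input rewrites the query.
unsafe_sql = "SELECT name FROM users WHERE name = '" + user_input + "'"
leaked = conn.execute(unsafe_sql).fetchall()

# Parameter binding keeps the input as data, so nothing matches.
safe = conn.execute("SELECT name FROM users WHERE name = ?",
                    (user_input,)).fetchall()
```

The query text is just a string to the host language, so nothing in the compiler catches the first form; that is the kind of problem an in-language query facility would avoid by construction.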

But for in-memory data manipulation, the container/collection filtering, mapping, folding for which you might have used a TTM-inspired relational algebra before LINQ and Streams existed, there is now LINQ and Streams. They won that race.

True, but that doesn't give you a database, updates, transactions, etc.

But that doesn't mean TTM ideas don't have a role. They do. For orchestrating containers or cloud services, supporting machine learning, providing custom data analytics, or supporting specialist operations in numerous vertical market niches. (I have a little, tiny bit of insider knowledge here... This is a big set of markets, and if TTM fans like us don't step up to the plate, we're going to get beaten to it by the Datalog fans.)

Feel free to explain further. I have no idea how 'orchestration' and 'analytics' fit in there.

Orchestration: Cloud service types are represented as relvars; cloud images, instances and services are modelled via tuples. Manipulate swarms of resources using the relational model.

No, I don't get that one. Why would you treat these things as a data model rather than an API?

Analytics: Perform ETL with relational tools -- all manner of different kinds of data sources are visible as relvars. Query them and integrate them using a relational algebra; emit them in standard formats for consumption by analytics tools.

This one I get. SQL is often used for ETL and it's often not a great fit.

Analytics: What if something like R or Julia, or a Python library, was a modern D?

Analytics: What would something like SAS or SPSS/X look like as a modern D?

I had a go at this with KNIME, and there is a problem: the relational model is not a good fit. There is a big focus on data cleaning, missing values and the interplay between qualitative and quantitative measures. Much of the data is time-ordered and the stats are things like moving average or rate of change. You think it would be nice to join this table from this source with that table from that source, but then you find out the formats or units or semantics are incompatible. There are tools to help with that, but not by seeing them as relations.
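One way to see the point about time-ordered statistics is a sketch like the following, where the readings are held as an unordered set of tuples and have to be put in time order before either statistic can be computed. The data and function names are made up for illustration.

```python
from datetime import date

# A relation is an unordered set of tuples: (day, reading).
readings = {
    (date(2021, 7, 3), 14.0),
    (date(2021, 7, 1), 10.0),
    (date(2021, 7, 4), 18.0),
    (date(2021, 7, 2), 12.0),
}


def moving_average(rel, window):
    """Order by time first; each result depends on a tuple's neighbours."""
    ordered = sorted(rel)  # sorts on day, the first component
    out = []
    for i in range(window - 1, len(ordered)):
        chunk = ordered[i - window + 1 : i + 1]
        out.append((ordered[i][0], sum(v for _, v in chunk) / window))
    return out


def rate_of_change(rel):
    """Difference between each reading and the previous one in time order."""
    ordered = sorted(rel)
    return [(d2, v2 - v1) for (_, v1), (d2, v2) in zip(ordered, ordered[1:])]


ma = moving_average(readings, 2)
roc = rate_of_change(readings)
```

Both functions take a relation and return an ordered list, which is a step outside the plain relational algebra.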

Analytics: Find a vertical market that does some standard, industry-specific processing. Implement it according to TTM principles.

Orchestration analytics: Cloud service metrics are available as relvars. Manipulate and present cloud infrastructure stats using the relational model.

As a replacement for anything any SQL does in a dominant way -- whether shared or single user -- or for LINQ and Streams in general?

No, not really.

Or only in a small way, and you'll kick yourself later for missing out on opportunities in containers or cloud services, supporting machine learning, providing custom data analytics, supporting specialist operations in numerous vertical market niches, and almost certainly dozens, maybe hundreds, of other places I haven't mentioned.

Please name and describe one specific user problem that this might solve, to help us understand why.

See my orchestration and analytics examples above.

They fall short of being specific concrete user problems, suited to a relational approach but not suited to SQL.

Andl - A New Database Language - andl.org
Quote from dandl on July 28, 2021, 1:35 pm

That's just plain silly. The contest is for an in-process database with SQL-like features: durable multi-table storage, a strong query language and transactions as the baseline. LINQ/Streams are not even in the race.

I think SQLite won that race.

It's not a race. It's a whole series of user/developer problems for which SQLite was a better solution than what came before (mostly ISAM I guess). SQLite brings its own set of problems (foreign language API, glue code, ORM, limited SQL types, SQL coding, injection, etc). At a guess 10% of SQLite users would use something else, but there isn't anything else. Plus a whole bunch of NoSQL users who really are dealing with tables but already chose not to use SQLite.

There are a lot of things in that space. Some NoSQL tools are entirely appropriate to certain use cases. Not every use of MongoDB or Cassandra is a case where the relational model would be better. Often, MongoDB or Cassandra are better for what MongoDB or Cassandra are being used to do.

The big player in the lightweight transactional persistence space is Berkeley DB, which is likely why Oracle bought it from Sleepycat Software in 2006. It's a key-value store. A lot of good things can be done effectively with a key-value store, and higher-level facilities are often built on a key-value store as a core.

But for in-memory data manipulation, the container/collection filtering, mapping, folding for which you might have used a TTM-inspired relational algebra before LINQ and Streams existed, there is now LINQ and Streams. They won that race.

True, but that doesn't give you a database, updates, transactions, etc.

But that doesn't mean TTM ideas don't have a role. They do. For orchestrating containers or cloud services, supporting machine learning, providing custom data analytics, or supporting specialist operations in numerous vertical market niches. (I have a little, tiny bit of insider knowledge here... This is a big set of markets, and if TTM fans like us don't step up to the plate, we're going to get beaten to it by the Datalog fans.)

Feel free to explain further. I have no idea how 'orchestration' and 'analytics' fit in there.

Orchestration: Cloud service types are represented as relvars; cloud images, instances and services are modelled via tuples. Manipulate swarms of resources using the relational model.

No, I don't get that one. Why would you treat these things as a data model rather than an API?

It's an API where the fundamental organising principle is the relational model.

Why?

Because it's a consistent representation for everything, with the ability to use JOIN to link references and use relational algebra in general to dice and slice resources.

I've seen a book that treated UNIX administration as if the system were a database and shell scripts were queries. I'll try to recall the title, though it's quite old now. I've also seen a utility designed to manage UNIX systems via a SQL-like interface.

Both were quite marvellous, in a quirky, weird, and surprisingly workable way. I suspect they would be even better using a true relational model, and rather than managing one system, manage many.
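A toy version of the orchestration idea, managing many resources through joins, might look like this. Everything here is an assumption made for illustration: the attribute names, the image and instance data, and the small `rel`/`join`/`restrict`/`project` helpers do not correspond to any real cloud API.

```python
def rel(*rows):
    """Build a relation: a set of tuples, each a frozenset of (attribute, value)."""
    return {frozenset(r.items()) for r in rows}


def join(r, s):
    """Natural join on whatever attribute names the two relations share."""
    out = set()
    for a in r:
        for b in s:
            da, db = dict(a), dict(b)
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                out.add(frozenset({**da, **db}.items()))
    return out


def restrict(r, pred):
    return {t for t in r if pred(dict(t))}


def project(r, *attrs):
    return {frozenset((k, v) for k, v in t if k in attrs) for t in r}


# Hypothetical relvars: one for images, one for running or stopped instances.
images = rel(
    {"image": "web:1.4", "service": "web"},
    {"image": "db:9.6", "service": "db"},
)
instances = rel(
    {"instance": "i-01", "image": "web:1.4", "region": "eu", "state": "running"},
    {"instance": "i-02", "image": "web:1.4", "region": "us", "state": "stopped"},
    {"instance": "i-03", "image": "db:9.6", "region": "eu", "state": "running"},
)

# Which services have a running instance in the "eu" region?
running_eu = restrict(join(instances, images),
                      lambda t: t["state"] == "running" and t["region"] == "eu")
services = project(running_eu, "service")
```

The `image` attribute is the reference that the join follows; the same operators would apply unchanged to any other pair of resource relations.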

Analytics: Perform ETL with relational tools -- all manner of different kinds of data sources are visible as relvars. Query them and integrate them using a relational algebra; emit them in standard formats for consumption by analytics tools.

This one I get. SQL is often used for ETL and it's often not a great fit.

Analytics: What if something like R or Julia, or a Python library, was a modern D?

Analytics: What would something like SAS or SPSS/X look like as a modern D?

I had a go at this with KNIME, and there is a problem: the relational model is not a good fit. There is a big focus on data cleaning, missing values and the interplay between qualitative and quantitative measures. Much of the data is time-ordered and the stats are things like moving average or rate of change. You think it would be nice to join this table from this source with that table from that source, but then you find out the formats or units or semantics are incompatible. There are tools to help with that, but not by seeing them as relations.

Is that a fundamental problem with the relational model?

Or just a particular application of it?

Analytics: Find a vertical market that does some standard, industry-specific processing. Implement it according to TTM principles.

Orchestration analytics: Cloud service metrics are available as relvars. Manipulate and present cloud infrastructure stats using the relational model.

As a replacement for anything any SQL does in a dominant way -- whether shared or single user -- or for LINQ and Streams in general?

No, not really.

Or only in a small way, and you'll kick yourself later for missing out on opportunities in containers or cloud services, supporting machine learning, providing custom data analytics, supporting specialist operations in numerous vertical market niches, and almost certainly dozens, maybe hundreds, of other places I haven't mentioned.

Please name and describe one specific user problem that this might solve, to help us understand why.

See my orchestration and analytics examples above.

They fall short of being specific concrete user problems, suited to a relational approach but not suited to SQL.

There are problems to be solved everywhere. Whether they can or can't be solved with a relational approach is for you to decide. I think there is plenty of application for TTM and relational ideas in solving them, but that requires thinking rather outside the just-a-replacement-for-SQL or here's-a-new-language-for-sale box.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on July 28, 2021, 2:10 pm
Quote from dandl on July 28, 2021, 1:35 pm

That's just plain silly. The contest is for an in-process database with SQL-like features: durable multi-table storage, a strong query language and transactions as the baseline. LINQ/Streams are not even in the race.

I think SQLite won that race.

It's not a race. It's a whole series of user/developer problems for which SQLite was a better solution than what came before (mostly ISAM I guess). SQLite brings its own set of problems (foreign language API, glue code, ORM, limited SQL types, SQL coding, injection, etc). At a guess 10% of SQLite users would use something else, but there isn't anything else. Plus a whole bunch of NoSQL users who really are dealing with tables but already chose not to use SQLite.

There are a lot of things in that space. Some NoSQL tools are entirely appropriate to certain use cases. Not every use of MongoDB or Cassandra is a case where the relational model would be better. Often, MongoDB or Cassandra are better for what MongoDB or Cassandra are being used to do.

The big player in the lightweight transactional persistence space is Berkeley DB, which is likely why Oracle bought it from Sleepycat Software in 2006. It's a key-value store. A lot of good things can be done effectively with a key-value store, and higher-level facilities are often built on a key-value store as a core.

I'm with you on that one. I've never used Berkeley DB but it's the best candidate I know for the underlying in-process transactional store. I've used c-tree and Btrieve and I've written my own ISAM (twice), but I don't want to write another one. And it's already got lots of the low level tools.

But that doesn't mean TTM ideas don't have a role. They do. For orchestrating containers or cloud services, supporting machine learning, providing custom data analytics, or supporting specialist operations in numerous vertical market niches. (I have a little, tiny bit of insider knowledge here... This is a big set of markets, and if TTM fans like us don't step up to the plate, we're going to get beaten to it by the Datalog fans.)

Feel free to explain further. I have no idea how 'orchestration' and 'analytics' fit in there.

Orchestration: Cloud service types are represented as relvars; cloud images, instances and services are modelled via tuples. Manipulate swarms of resources using the relational model.

No, I don't get that one. Why would you treat these things as a data model rather than an API?

It's an API where the fundamental organising principle is the relational model.

Why?

Because it's a consistent representation for everything, with the ability to use JOIN to link references and use relational algebra in general to dice and slice resources.

I've seen a book that treated UNIX administration as if the system were a database and shell scripts were queries. I'll try to recall the title, though it's quite old now. I've also seen a utility designed to manage UNIX systems via a SQL-like interface.

Both were quite marvellous, in a quirky, weird, and surprisingly workable way. I suspect they would be even better using a true relational model, and rather than managing one system, manage many.

And that really is the point: the data model. One of the reasons SQL is so dominant is DDL, and that's a problem TTM seems to ignore (in practice you can get there by writing code).

So it should be possible to:

  • create or acquire an abstract data model
  • express it in code in the host language with key and constraints (in Java use records, not tuples, and lambdas)
  • create relvars as needed (in Java use generics parameterised on the record definition)
  • create relcons as needed (in Java use lambdas)
  • place relvars in a database or connect them to external providers
  • use a library of Extended RA operators for queries and updates (*)
  • generate basic CRUD code for each relvar

All of this is within the capabilities of existing GP languages, except for (*) heading/type inference across RA operators. This requires a pre-processor, modified compiler, meta-programming or runtime code.

But the data model is the key. If that doesn't look relational, all bets are off.
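As a rough illustration of the first few steps in that list, the following Java sketch expresses a tuple type as a record, and a relvar as a generic class parameterised on that record with a key and a constraint supplied as lambdas. The Part record, the Relvar class and its methods are hypothetical names invented here, and treating the insert of an already-present tuple as a no-op is an assumption rather than anything TTM mandates. Persistence, transactions and the RA operators are left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical tuple type expressed as a native Java record.
record Part(int id, String name, int weight) {}

// A relvar over tuples of type T with one key (of type K) and one tuple constraint.
final class Relvar<T, K> {
    private final Function<T, K> key;        // extracts the key value from a tuple
    private final Predicate<T> constraint;   // must hold for every tuple in the body
    private final Map<K, T> body = new LinkedHashMap<>();

    Relvar(Function<T, K> key, Predicate<T> constraint) {
        this.key = key;
        this.constraint = constraint;
    }

    // Inserts a tuple, rejecting constraint and key violations.
    // Re-inserting an identical tuple is a no-op (assumed set semantics).
    void insert(T tuple) {
        if (!constraint.test(tuple)) {
            throw new IllegalArgumentException("constraint violated: " + tuple);
        }
        K k = key.apply(tuple);
        T existing = body.get(k);
        if (existing != null && !existing.equals(tuple)) {
            throw new IllegalStateException("key violated: " + k);
        }
        body.put(k, tuple);
    }

    // The current relation value as an unmodifiable set of tuples.
    Set<T> value() {
        return Set.copyOf(body.values());
    }

    public static void main(String[] args) {
        Relvar<Part, Integer> parts = new Relvar<>(Part::id, p -> p.weight() > 0);
        parts.insert(new Part(1, "Nut", 12));
        parts.insert(new Part(2, "Bolt", 17));
        System.out.println(parts.value().size()); // prints 2
    }
}
```

Note that the key and the constraint are checked only at run time here; nothing about them is visible in the Java type of the relvar.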

Analytics: Perform ETL with relational tools -- all manner of different kinds of data sources are visible as relvars. Query them and integrate them using a relational algebra; emit them in standard formats for consumption by analytics tools.

This one I get. SQL is often used for ETL and it's often not a great fit.

Analytics: What if something like R or Julia, or a Python library, was a modern D?

Analytics: What would something like SAS or SPSS/X look like as a modern D?

I had a go at this with Knime, and there is a problem: the relational model is not a good fit. There is a big focus on data cleaning, missing values and the interplay between qualitative and quantitative measures. Much of the data is time-ordered and the stats are things like moving average or rate of change. You think it would be nice to join this table from this source with that table from that source, but then you find out the formats or units or semantics are incompatible. There are tools to help with that, but not by seeing them as relations.

Is that a fundamental problem with the relational model?

Or just a particular application of it?

The relational model is not a good fit for continuous or time-series data, or anything where order matters. The RM does not define a type system, but the TTM type system is not a good fit for missing values. It's particularly a problem for aggregation (of all kinds).

Analytics: Find a vertical market that does some standard, industry-specific processing. Implement it according to TTM principles.

Orchestration analytics: Cloud service metrics are available as relvars. Manipulate and present cloud infrastructure stats using the relational model.

As a replacement for anything any SQL does in a dominant way -- whether shared or single user -- or for LINQ and Streams in general?

No, not really.

Or only in a small way, and you'll kick yourself later for missing out on opportunities in containers or cloud services, supporting machine learning, providing custom data analytics, supporting specialist operations in numerous vertical market niches, and almost certainly dozens, maybe hundreds, of other places I haven't mentioned.

Please name and describe one specific user problem that this might solve, to help us understand why.

See my orchestration and analytics examples above.

They fall short of being specific concrete user problems, suited to a relational approach but not suited to SQL.

There are problems to be solved everywhere. Whether they can or can't be solved with a relational approach is for you to decide. I think there is plenty of application for TTM and relational ideas in solving them, but that requires thinking rather outside the just-a-replacement-for-SQL or here's-a-new-language-for-sale box.

Not sure that helps. The prerequisites are a relational model and some data. If we're going to store it then it's hard to ignore SQL. If not, where is it stored? Who has that kind of data and that kind of problem? Sounds like a solution in search of a problem, and that rarely works out well.

Andl - A New Database Language - andl.org
Quote from dandl on July 29, 2021, 1:45 am
Quote from Dave Voorhis on July 28, 2021, 2:10 pm
Quote from dandl on July 28, 2021, 1:35 pm

That's just plain silly. The contest is in-process database with SQL-like features: durable multi-table storage, strong query language, transactions as the baseline. Linq/Streams are not even in the race.

I think SQLite won that race.

It's not a race. It's a whole series of user/developer problems for which SQLite was a better solution than what came before (mostly ISAM I guess). SQLite brings its own set of problems (foreign language API, glue code, ORM, limited SQL types, SQL coding, injection, etc). At a guess 10% of SQLite users would use something else, but there isn't anything else. Plus a whole bunch of NoSQL users who really are dealing with tables but already chose not to use SQLite.

There are a lot of things in that space. Some NoSQL tools are entirely appropriate to certain use cases. Not every use of MongoDB or Cassandra is a case where the relational model would be better. Often, MongoDB or Cassandra are better for what MongoDB or Cassandra are being used to do.

The big player in the lightweight transactional persistence space is the Berkeley DB, which is likely why Oracle bought it from Sleepycat Software in 2006. It's a key-value store. A lot of good things can be done effectively with a key-value store, and higher-level facilities are often built on a key-value store as a core.

I'm with you on that one. I've never used Berkeley DB but it's the best candidate I know for the underlying in-process transactional store. I've used c-tree and Btrieve and I've written my own ISAM (twice), but I don't want to write another one. And it's already got lots of the low-level tools.

But that doesn't mean TTM ideas don't have a role. They do. For orchestrating containers or cloud services, supporting machine learning, providing custom data analytics, or supporting specialist operations in numerous vertical market niches. (I have a little, tiny bit of insider knowledge here... This is a big set of markets, and if TTM fans like us don't step up to the plate, we're going to get beaten to it by the Datalog fans.)

Feel free to explain further. I have no idea how 'orchestration' and 'analytics' fit in there.

Orchestration: Cloud service types are represented as relvars; cloud images, instances and services are modelled via tuples. Manipulate swarms of resources using the relational model.

No, I don't get that one. Why would you treat these things as a data model rather than an API?

It's an API where the fundamental organising principle is the relational model.

Why?

Because it's a consistent representation for everything, with the ability to use JOIN to link references and use relational algebra in general to dice and slice resources.

I've seen a book that treated UNIX administration as if the system was a database and shell scripting were queries. I'll try to recall the title, though it's quite old now. I've seen a utility designed to manage UNIX systems via a SQL-like interface.

Both were quite marvellous, in a quirky, weird, and surprisingly workable way. I suspect they would be even better using a true relational model, and rather than managing one system, manage many.

And that really is the point: the data model. One of the reasons SQL is so dominant is DDL, and that's a problem TTM seems to ignore (in practice you can get there by writing code).

So it should be possible to:

  • create or acquire an abstract data model
  • express it in code in the host language with key and constraints (in Java use records, not tuples, and lambdas)

Aside: Java 14 and above provides the new record construct. Most production Java is either Java 11 (new) or Java 8 (legacy), these being the main long-term-support (LTS) versions in the field. Java 17 is LTS but won't be out until September 2021. In short, that means we'll be using (tuple) classes rather than records for a while, except for those production projects fortunate enough to adopt Java 17. Lambdas (from Java 8) are fine.

  • create relvars as needed (in Java use generics parameterised on the record definition)
  • create relcons as needed (in Java use lambdas)
  • place relvars in a database or connect them to external providers
  • use a library of Extended RA operators for queries and updates (*)
  • generate basic CRUD code for each relvar

All of this is within the capabilities of existing GP languages, except for (*) heading/type inference across RA operators. This requires a pre-processor, modified compiler, meta-programming or runtime code.

But the data model is the key. If that doesn't look relational, all bets are off.

The modern Java approach is to do all of that, except the operators are Java Streams operators rather than RA operators. Thus, no pre-processor, modified compiler, meta-programming, or special runtime code is required. You get a certain semantic equivalence to most of a typical relational algebra except for ad-hoc JOIN.

Ad-hoc JOIN is generally deprecated for performance reasons, in favour of maintaining an object graph of (effectively) pre-joined relationships between instances. In a typical running object-oriented program, that exists by default as a result of being a running object-oriented program.

But if you're representing external data in your running object-oriented program, then either an object graph needs to be created and/or an ad-hoc JOIN Streams-compatible operator needs to be provided. In a previous discussion, I posted an illustrative example of a usable (though highly inefficient) ad-hoc JOIN implementation that works with Java Streams.
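For readers who haven't seen the earlier discussion, the following is a fresh minimal sketch of what a naive ad-hoc JOIN over Java Streams could look like. It is not the example referred to above. The StreamJoin class, its join method, and the Emp, Dept and EmpDept records are hypothetical names invented for illustration. It is a plain nested-loop join, so it costs O(n*m) comparisons.

```java
import java.util.Collection;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BiPredicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical tuple types for the demonstration.
record Dept(int deptId, String name) {}
record Emp(int empId, int deptId) {}
record EmpDept(int empId, String deptName) {}

class StreamJoin {
    // Nested-loop join: for each left element, scan the whole right collection,
    // keep the pairs that match, and combine each pair into a result tuple.
    // The right side is a Collection because a Stream can be consumed only once.
    static <L, R, T> Stream<T> join(Stream<L> left,
                                    Collection<R> right,
                                    BiPredicate<L, R> matches,
                                    BiFunction<L, R, T> combine) {
        return left.flatMap(l -> right.stream()
            .filter(r -> matches.test(l, r))
            .map(r -> combine.apply(l, r)));
    }

    public static void main(String[] args) {
        List<Dept> depts = List.of(new Dept(10, "Sales"), new Dept(20, "Ops"));
        List<Emp> emps = List.of(new Emp(1, 10), new Emp(2, 20), new Emp(3, 30));
        System.out.println(
            join(emps.stream(), depts,
                 (e, d) -> e.deptId() == d.deptId(),
                 (e, d) -> new EmpDept(e.empId(), d.name()))
                .collect(Collectors.toSet())
                .size()); // prints 2: employee 3 has no matching department
    }
}
```

The caller has to declare the result type (EmpDept here) and write the combining lambda by hand; nothing infers the joined heading from the two inputs.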

Analytics: Perform ETL with relational tools -- all manner of different kinds of data sources are visible as relvars. Query them and integrate them using a relational algebra; emit them in standard formats for consumption by analytics tools.

This one I get. SQL is often used for ETL and it's often not a great fit.

Analytics: What if something like R or Julia, or a Python library, was a modern D?

Analytics: What would something like SAS or SPSS/X look like as a modern D?

I had a go at this with Knime, and there is a problem: the relational model is not a good fit. There is a big focus on data cleaning, missing values and the interplay between qualitative and quantitative measures. Much of the data is time-ordered and the stats are things like moving average or rate of change. You think it would be nice to join this table from this source with that table from that source, but then you find out the formats or units or semantics are incompatible. There are tools to help with that, but not by seeing them as relations.

Is that a fundamental problem with the relational model?

Or just a particular application of it?

The relational model is not a good fit for continuous or time-series data, or anything where order matters. The RM does not define a type system, but the TTM type system is not a good fit for missing values. It's particularly a problem for aggregation (of all kinds).

These sound like opportunities for further R&D.

I think missing values are effectively handled with option types, but I know not everyone agrees so practical implementations might provide multiple ways to handle missing values.
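A small Java sketch of the option-type approach, under the assumption that a missing measurement is modelled as an empty OptionalDouble attribute. The Reading record and the MissingValues class are hypothetical names used only for illustration. The aggregate has to say explicitly what it does with missing values (here: skip them), and it returns an empty result when nothing is present.

```java
import java.util.Collection;
import java.util.List;
import java.util.OptionalDouble;

// Hypothetical tuple type: the value attribute may be missing.
record Reading(String sensor, OptionalDouble value) {}

class MissingValues {
    // Average over present values only; empty if no value is present.
    static OptionalDouble averagePresent(Collection<Reading> readings) {
        return readings.stream()
            .map(Reading::value)
            .filter(OptionalDouble::isPresent)
            .mapToDouble(OptionalDouble::getAsDouble)
            .average();
    }

    // Number of tuples whose value attribute is missing.
    static long countMissing(Collection<Reading> readings) {
        return readings.stream()
            .filter(r -> r.value().isEmpty())
            .count();
    }

    public static void main(String[] args) {
        List<Reading> readings = List.of(
            new Reading("a", OptionalDouble.of(1.0)),
            new Reading("b", OptionalDouble.empty()),
            new Reading("c", OptionalDouble.of(3.0)));
        System.out.println(averagePresent(readings)); // prints OptionalDouble[2.0]
        System.out.println(countMissing(readings));   // prints 1
    }
}
```

Whether skipping is the right policy depends on the analysis, which is presumably why an implementation might offer more than one way to handle missing values.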

Analytics: Find a vertical market that does some standard, industry-specific processing. Implement it according to TTM principles.

Orchestration analytics: Cloud service metrics are available as relvars. Manipulate and present cloud infrastructure stats using the relational model.

As a replacement for anything any SQL does in a dominant way -- whether shared or single user -- or for LINQ and Streams in general?

No, not really.

Or only in a small way, and you'll kick yourself later for missing out on opportunities in containers or cloud services, supporting machine learning, providing custom data analytics, supporting specialist operations in numerous vertical market niches, and almost certainly dozens, maybe hundreds, of other places I haven't mentioned.

Please name and describe one specific user problem that this might solve, to help us understand why.

See my orchestration and analytics examples above.

They fall short of being specific concrete user problems, suited to a relational approach but not suited to SQL.

There are problems to be solved everywhere. Whether they can or can't be solved with a relational approach is for you to decide. I think there is plenty of application for TTM and relational ideas in solving them, but that requires thinking rather outside the just-a-replacement-for-SQL or here's-a-new-language-for-sale box.

Not sure that helps. The prerequisites are a relational model and some data. If we're going to store it then it's hard to ignore SQL. If not, where is it stored? Who has that kind of data and that kind of problem? Sounds like a solution in search of a problem, and that rarely works out well.

It's a solution in search of a problem if the starting point is trying to fit everything into the relational model. Instead, if the starting point is trying to solve a problem and it turns out that the relational model -- or something similar to it -- provides an effective unifying representation, then TTM ideas may well apply.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

So it should be possible to:

  • create or acquire an abstract data model
  • express it in code in the host language with key and constraints (in Java use records, not tuples, and lambdas)

Aside: Java 14 and above provides the new record construct. Most production Java is either Java 11 (new) or Java 8 (legacy), these being the main long-term-support (LTS) versions in the field. Java 17 is LTS but won't be out until September 2021. In short, that means we'll be using (tuple) classes rather than records for a while, except for those production projects fortunate enough to adopt Java 17. Lambdas (from Java 8) are fine.

Yes, I know. The point here is that

  • exposed records are native Java/C#/C++ or whatever, with not a tuple in sight, e.g. in C#, struct Customer.
  • Relation is an opaque generic type eg Relation<Customer> with
    • ctor is Relation(IList<Customer>) or similar
    • exposes IEnumerable<Customer> ToList()
    • provides the (E)RA as operations on relations only
    • provides (multiple) assignment and transactions as APIs

[Sorry about the C#, my Java is not strong enough.]
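For anyone following along in Java, a rough equivalent of that shape might look like the sketch below. It is only an illustration under assumptions: the Customer attributes and the method names restrict, project, union, toList and cardinality are invented here, and assignment and transactions are left out entirely.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical native record; no tuple type is exposed.
record Customer(int id, String city) {}

// Opaque, immutable relation value; the body is a set, so duplicates collapse.
final class Relation<T> {
    private final Set<T> body;

    Relation(Collection<T> tuples) {
        this.body = Set.copyOf(tuples);
    }

    // RA restriction: keep the tuples satisfying the predicate.
    Relation<T> restrict(Predicate<T> p) {
        return new Relation<>(body.stream().filter(p).collect(Collectors.toList()));
    }

    // RA projection, approximated by mapping each tuple to a caller-declared type U.
    <U> Relation<U> project(Function<T, U> f) {
        return new Relation<>(body.stream().map(f).collect(Collectors.toList()));
    }

    // RA union of two relations of the same type.
    Relation<T> union(Relation<T> other) {
        List<T> all = new ArrayList<>(body);
        all.addAll(other.body);
        return new Relation<>(all);
    }

    // The only way out of the opaque type: a list copy in unspecified order.
    List<T> toList() {
        return List.copyOf(body);
    }

    int cardinality() {
        return body.size();
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Relation && body.equals(((Relation<?>) o).body);
    }

    @Override
    public int hashCode() {
        return body.hashCode();
    }

    public static void main(String[] args) {
        Relation<Customer> customers = new Relation<>(List.of(
            new Customer(1, "Perth"), new Customer(2, "Derby"), new Customer(1, "Perth")));
        System.out.println(customers.cardinality()); // prints 2
    }
}
```

The project operation shows where the heading/type inference gap bites: the result type U has to be declared by the caller rather than derived from the operand's heading.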

  • create relvars as needed (in Java use generics parameterised on the record definition)
  • create relcons as needed (in Java use lambdas)
  • place relvars in a database or connect them to external providers
  • use a library of Extended RA operators for queries and updates (*)
  • generate basic CRUD code for each relvar

All of this is within the capabilities of existing GP languages, except for (*) heading/type inference across RA operators. This requires a pre-processor, modified compiler, meta-programming or runtime code.

But the data model is the key. If that doesn't look relational, all bets are off.

The modern Java approach is to do all of that, except the operators are Java Streams operators rather than RA operators. Thus, no pre-processor, modified compiler, meta-programming, or special runtime code is required. You get a certain semantic equivalence to most of a typical relational algebra except for ad-hoc JOIN.

Ad-hoc JOIN is generally deprecated for performance reasons, in favour of maintaining an object graph of (effectively) pre-joined relationships between instances. In a typical running object-oriented program, that exists by default as a result of being a running object-oriented program.

But if you're representing external data in your running object-oriented program, then either an object graph needs to be created and/or an ad-hoc JOIN Streams-compatible operator needs to be provided. In a previous discussion, I posted an illustrative example of a usable (though highly inefficient) ad-hoc JOIN implementation that works with Java Streams.

I know, so the benefit has to be in what Streams does not provide: enhanced queries, database, relvars, transactions, constraints etc. All the database stuff.

But you won't find interesting new solutions by looking at how Streams and LINQ are used now. You have to look for problems they couldn't solve, and where the SQL-based solution works but is clunky.

Andl - A New Database Language - andl.org
Quote from dandl on July 29, 2021, 1:01 pm

So it should be possible to:

  • create or acquire an abstract data model
  • express it in code in the host language with key and constraints (in Java use records, not tuples, and lambdas)

Aside: Java 14 and above provides the new record construct. Most production Java is either Java 11 (new) or Java 8 (legacy), these being the main long-term-support (LTS) versions in the field. Java 17 is LTS but won't be out until September 2021. In short, that means we'll be using (tuple) classes rather than records for a while, except for those production projects fortunate enough to adopt Java 17. Lambdas (from Java 8) are fine.

Yes, I know. The point here is that

  • exposed records are native Java/C#/C++ or whatever, with not a tuple in sight, e.g. in C#, struct Customer.
  • Relation is an opaque generic type eg Relation<Customer> with
    • ctor is Relation(IList<Customer>) or similar
    • exposes IEnumerable<Customer> ToList()
    • provides the (E)RA as operations on relations only
    • provides (multiple) assignment and transactions as APIs

[Sorry about the C#, my Java is not strong enough.]

  • create relvars as needed (in Java use generics parameterised on the record definition)
  • create relcons as needed (in Java use lambdas)
  • place relvars in a database or connect them to external providers
  • use a library of Extended RA operators for queries and updates (*)
  • generate basic CRUD code for each relvar

All of this is within the capabilities of existing GP languages, except for (*) heading/type inference across RA operators. This requires a pre-processor, modified compiler, meta-programming or runtime code.

But the data model is the key. If that doesn't look relational, all bets are off.

The modern Java approach is to do all of that, except the operators are Java Streams operators rather than RA operators. Thus, no pre-processor, modified compiler, meta-programming, or special runtime code is required. You get a certain semantic equivalence to most of a typical relational algebra except for ad-hoc JOIN.

Ad-hoc JOIN is generally deprecated for performance reasons, in favour of maintaining an object graph of (effectively) pre-joined relationships between instances. In a typical running object-oriented program, that exists by default as a result of being a running object-oriented program.

But if you're representing external data in your running object-oriented program, then either an object graph needs to be created and/or an ad-hoc JOIN Streams-compatible operator needs to be provided. In a previous discussion, I posted an illustrative example of a usable (though highly inefficient) ad-hoc JOIN implementation that works with Java Streams.

I know, so the benefit has to be in what Streams does not provide: enhanced queries, database, relvars, transactions, constraints etc. All the database stuff.

But you won't find interesting new solutions by looking at how Streams and LINQ are used now. You have to look for problems they couldn't solve, and where the SQL-based solution works but is clunky.

I'm not sure a (hopefully slightly-less-clunky) TTM-based approach to queries, databases, transactions and constraints is enough to overcome the momentum (or maybe it's inertia) of SQLite (or Berkeley DB, or other NoSQL dbs) and LINQ/Streams.

But maybe it is.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org