Function relations

#11 · March 31, 2020, 8:03 am

Quote from Dave Voorhis on March 31, 2020, 8:03 am

Quote from dandl on March 30, 2020, 11:11 pm

Yes, just to be clear, as long as the goal is to produce a 'super-SQL' then TTM/D is a good try at what it might look like. The arcane type system is a huge hurdle, making it almost impossible to merge with any reasonably common GP language. Tweaking the type system might be the way to go.

It's no worse than SQL, maybe better, and in fact technically very similar, I've found.

A close coupling between, say Tutorial D / Rel and Java with user-defined types is largely infeasible, but a loose coupling with the Java world doing the client-side thing and Rel doing the server-side thing, with a default translation between primitives (RATIONAL <--> double, INTEGER <--> long, BOOLEAN <--> boolean, CHARACTER <--> String) and some custom translations for a limited set of universal user-defined types (DATE <--> Date, DECIMAL <--> BigDecimal) is reasonable.

But my goal here is quite different. The SQL DQL/DML is extraordinarily widely used as glue for application code to get access to data, but it fails if the data isn't in an RDBMS. You can't even do an SQL query on data in a spreadsheet, which much of it seems to be.

I'm not sure why the popular SQL implementations haven't made more of an effort to support "external tables" (for lack of a better term) where an "external table" looks like any other table from within the DBMS but as actually a connection to an external spreadsheet, CSV file, other DBMS, or whatever.

Microsoft Access does this quite well, and Oracle Database has some facilities for connecting to other DBMSs, but I'm not aware of the facility existing in other popular SQL DBMSs. Maybe it's there and I never noticed?

I imagine handling updates to external tables in a transactional and performant way is a hurdle. Rel supports external relvars of various kinds (a feature I use constantly), but they're mainly read-only.

But then there are SQL-like non-DBMS things that do this, like PartiQL (see https://aws.amazon.com/blogs/opensource/announcing-partiql-one-query-language-for-all-your-data/), which I think has been mentioned here before.

Quote from dandl on March 30, 2020, 11:11 pm

Yes, just to be clear, as long as the goal is to produce a 'super-SQL' then TTM/D is a good try at what it might look like. The arcane type system is a huge hurdle, making it almost impossible to merge with any reasonably common GP language. Tweaking the type system might be the way to go.

It's no worse than SQL, maybe better, and in fact technically very similar, I've found.

A close coupling between, say Tutorial D / Rel and Java with user-defined types is largely infeasible, but a loose coupling with the Java world doing the client-side thing and Rel doing the server-side thing, with a default translation between primitives (RATIONAL <--> double, INTEGER <--> long, BOOLEAN <--> boolean, CHARACTER <--> String) and some custom translations for a limited set of universal user-defined types (DATE <--> Date, DECIMAL <--> BigDecimal) is reasonable.

But my goal here is quite different. The SQL DQL/DML is extraordinarily widely used as glue for application code to get access to data, but it fails if the data isn't in an RDBMS. You can't even do an SQL query on data in a spreadsheet, which much of it seems to be.

I'm not sure why the popular SQL implementations haven't made more of an effort to support "external tables" (for lack of a better term) where an "external table" looks like any other table from within the DBMS but as actually a connection to an external spreadsheet, CSV file, other DBMS, or whatever.

Microsoft Access does this quite well, and Oracle Database has some facilities for connecting to other DBMSs, but I'm not aware of the facility existing in other popular SQL DBMSs. Maybe it's there and I never noticed?

I imagine handling updates to external tables in a transactional and performant way is a hurdle. Rel supports external relvars of various kinds (a feature I use constantly), but they're mainly read-only.

But then there are SQL-like non-DBMS things that do this, like PartiQL (see https://aws.amazon.com/blogs/opensource/announcing-partiql-one-query-language-for-all-your-data/), which I think has been mentioned here before.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#12 · March 31, 2020, 8:23 am

Quote from Erwin on March 31, 2020, 8:23 am

Quote from Dave Voorhis on March 31, 2020, 8:03 am

Quote from dandl on March 30, 2020, 11:11 pm

Yes, just to be clear, as long as the goal is to produce a 'super-SQL' then TTM/D is a good try at what it might look like. The arcane type system is a huge hurdle, making it almost impossible to merge with any reasonably common GP language. Tweaking the type system might be the way to go.

It's no worse than SQL, maybe better, and in fact technically very similar, I've found.

A close coupling between, say Tutorial D / Rel and Java with user-defined types is largely infeasible, but a loose coupling with the Java world doing the client-side thing and Rel doing the server-side thing, with a default translation between primitives (RATIONAL <--> double, INTEGER <--> long, BOOLEAN <--> boolean, CHARACTER <--> String) and some custom translations for a limited set of universal user-defined types (DATE <--> Date, DECIMAL <--> BigDecimal) is reasonable.

But my goal here is quite different. The SQL DQL/DML is extraordinarily widely used as glue for application code to get access to data, but it fails if the data isn't in an RDBMS. You can't even do an SQL query on data in a spreadsheet, which much of it seems to be.

I'm not sure why the popular SQL implementations haven't made more of an effort to support "external tables" (for lack of a better term) where an "external table" looks like any other table from within the DBMS but as actually a connection to an external spreadsheet, CSV file, other DBMS, or whatever.

Microsoft Access does this quite well, and Oracle Database has some facilities for connecting to other DBMSs, but I'm not aware of the facility existing in other popular SQL DBMSs. Maybe it's there and I never noticed?

I imagine handling updates to external tables in a transactional and performant way is a hurdle. Rel supports external relvars of various kinds (a feature I use constantly), but they're mainly read-only.

But then there are SQL-like non-DBMS things that do this, like PartiQL (see https://aws.amazon.com/blogs/opensource/announcing-partiql-one-query-language-for-all-your-data/), which I think has been mentioned here before.

The system overall (that is, including the spreadsheet editor itself , or whatever stupid text editor is used to edit the csv directly) violates rule 12 by definition.

Countering that violation means making the spreadsheet inaccessible to excel, and the csv inaccessible to notepad.

Then the "spreadsheet" or the "csv" is a mere physical storage format, and a hopelessly inefficient one for most of the purposes. So why bother ?

Quote from Dave Voorhis on March 31, 2020, 8:03 am

Quote from dandl on March 30, 2020, 11:11 pm

Yes, just to be clear, as long as the goal is to produce a 'super-SQL' then TTM/D is a good try at what it might look like. The arcane type system is a huge hurdle, making it almost impossible to merge with any reasonably common GP language. Tweaking the type system might be the way to go.

It's no worse than SQL, maybe better, and in fact technically very similar, I've found.

A close coupling between, say Tutorial D / Rel and Java with user-defined types is largely infeasible, but a loose coupling with the Java world doing the client-side thing and Rel doing the server-side thing, with a default translation between primitives (RATIONAL <--> double, INTEGER <--> long, BOOLEAN <--> boolean, CHARACTER <--> String) and some custom translations for a limited set of universal user-defined types (DATE <--> Date, DECIMAL <--> BigDecimal) is reasonable.

But my goal here is quite different. The SQL DQL/DML is extraordinarily widely used as glue for application code to get access to data, but it fails if the data isn't in an RDBMS. You can't even do an SQL query on data in a spreadsheet, which much of it seems to be.

I'm not sure why the popular SQL implementations haven't made more of an effort to support "external tables" (for lack of a better term) where an "external table" looks like any other table from within the DBMS but as actually a connection to an external spreadsheet, CSV file, other DBMS, or whatever.

Microsoft Access does this quite well, and Oracle Database has some facilities for connecting to other DBMSs, but I'm not aware of the facility existing in other popular SQL DBMSs. Maybe it's there and I never noticed?

I imagine handling updates to external tables in a transactional and performant way is a hurdle. Rel supports external relvars of various kinds (a feature I use constantly), but they're mainly read-only.

But then there are SQL-like non-DBMS things that do this, like PartiQL (see https://aws.amazon.com/blogs/opensource/announcing-partiql-one-query-language-for-all-your-data/), which I think has been mentioned here before.

The system overall (that is, including the spreadsheet editor itself , or whatever stupid text editor is used to edit the csv directly) violates rule 12 by definition.

Countering that violation means making the spreadsheet inaccessible to excel, and the csv inaccessible to notepad.

Then the "spreadsheet" or the "csv" is a mere physical storage format, and a hopelessly inefficient one for most of the purposes. So why bother ?

Author of SIRA_PRISE

#13 · March 31, 2020, 9:54 am

Quote from dandl on March 31, 2020, 9:54 am

Quote from Dave Voorhis on March 31, 2020, 8:03 am

Quote from dandl on March 30, 2020, 11:11 pm

Yes, just to be clear, as long as the goal is to produce a 'super-SQL' then TTM/D is a good try at what it might look like. The arcane type system is a huge hurdle, making it almost impossible to merge with any reasonably common GP language. Tweaking the type system might be the way to go.

It's no worse than SQL, maybe better, and in fact technically very similar, I've found.

A close coupling between, say Tutorial D / Rel and Java with user-defined types is largely infeasible, but a loose coupling with the Java world doing the client-side thing and Rel doing the server-side thing, with a default translation between primitives (RATIONAL <--> double, INTEGER <--> long, BOOLEAN <--> boolean, CHARACTER <--> String) and some custom translations for a limited set of universal user-defined types (DATE <--> Date, DECIMAL <--> BigDecimal) is reasonable.

I've explored most of those possibilities with Andl. I've produce backends with SQLite and Postgres, a protocol layer with Thrift and a direct API layer. It's all perfectly feasible, but it only solves problems most people think are already solved (by SQL).

But my goal here is quite different. The SQL DQL/DML is extraordinarily widely used as glue for application code to get access to data, but it fails if the data isn't in an RDBMS. You can't even do an SQL query on data in a spreadsheet, which much of it seems to be.

I'm not sure why the popular SQL implementations haven't made more of an effort to support "external tables" (for lack of a better term) where an "external table" looks like any other table from within the DBMS but as actually a connection to an external spreadsheet, CSV file, other DBMS, or whatever.

Microsoft Access does this quite well, and Oracle Database has some facilities for connecting to other DBMSs, but I'm not aware of the facility existing in other popular SQL DBMSs. Maybe it's there and I never noticed?

For better or worse there has been quite a move away from Access-like products, and client-side SQL generally. Most server-side products do stuff like this, but servers make life hard. If I've got a collection of interesting data tables with a mix of CSV, XLS and maybe the odd remote API to manipulate, I don't necessarily want to install and configure Oracle as my first step. Let's say in these current days of market gyrations I'd like to do a bit of analysis on my share portfolio. Is my first thought to install and configure SQL Server? Maybe not, I'm much more likely to reach for Excel and endure the pain.

I imagine handling updates to external tables in a transactional and performant way is a hurdle. Rel supports external relvars of various kinds (a feature I use constantly), but they're mainly read-only.

But then there are SQL-like non-DBMS things that do this, like PartiQL (see https://aws.amazon.com/blogs/opensource/announcing-partiql-one-query-language-for-all-your-data/), which I think has been mentioned here before.

Starting from scratch, estimate how long it would take me to get the first useful result from AWS? Days, probably. I could use one of these no-code platforms (one of my associates uses Knack, another uses AirTable) but the learning curve is steep, and then they have my data. There's something very reassuring about desktop apps.

Quote from Dave Voorhis on March 31, 2020, 8:03 am

Quote from dandl on March 30, 2020, 11:11 pm

Yes, just to be clear, as long as the goal is to produce a 'super-SQL' then TTM/D is a good try at what it might look like. The arcane type system is a huge hurdle, making it almost impossible to merge with any reasonably common GP language. Tweaking the type system might be the way to go.

It's no worse than SQL, maybe better, and in fact technically very similar, I've found.

A close coupling between, say Tutorial D / Rel and Java with user-defined types is largely infeasible, but a loose coupling with the Java world doing the client-side thing and Rel doing the server-side thing, with a default translation between primitives (RATIONAL <--> double, INTEGER <--> long, BOOLEAN <--> boolean, CHARACTER <--> String) and some custom translations for a limited set of universal user-defined types (DATE <--> Date, DECIMAL <--> BigDecimal) is reasonable.

I've explored most of those possibilities with Andl. I've produce backends with SQLite and Postgres, a protocol layer with Thrift and a direct API layer. It's all perfectly feasible, but it only solves problems most people think are already solved (by SQL).

But my goal here is quite different. The SQL DQL/DML is extraordinarily widely used as glue for application code to get access to data, but it fails if the data isn't in an RDBMS. You can't even do an SQL query on data in a spreadsheet, which much of it seems to be.

I'm not sure why the popular SQL implementations haven't made more of an effort to support "external tables" (for lack of a better term) where an "external table" looks like any other table from within the DBMS but as actually a connection to an external spreadsheet, CSV file, other DBMS, or whatever.

Microsoft Access does this quite well, and Oracle Database has some facilities for connecting to other DBMSs, but I'm not aware of the facility existing in other popular SQL DBMSs. Maybe it's there and I never noticed?

For better or worse there has been quite a move away from Access-like products, and client-side SQL generally. Most server-side products do stuff like this, but servers make life hard. If I've got a collection of interesting data tables with a mix of CSV, XLS and maybe the odd remote API to manipulate, I don't necessarily want to install and configure Oracle as my first step. Let's say in these current days of market gyrations I'd like to do a bit of analysis on my share portfolio. Is my first thought to install and configure SQL Server? Maybe not, I'm much more likely to reach for Excel and endure the pain.

I imagine handling updates to external tables in a transactional and performant way is a hurdle. Rel supports external relvars of various kinds (a feature I use constantly), but they're mainly read-only.

But then there are SQL-like non-DBMS things that do this, like PartiQL (see https://aws.amazon.com/blogs/opensource/announcing-partiql-one-query-language-for-all-your-data/), which I think has been mentioned here before.

Starting from scratch, estimate how long it would take me to get the first useful result from AWS? Days, probably. I could use one of these no-code platforms (one of my associates uses Knack, another uses AirTable) but the learning curve is steep, and then they have my data. There's something very reassuring about desktop apps.

Andl - A New Database Language - andl.org

#14 · March 31, 2020, 1:57 pm

Quote from Dave Voorhis on March 31, 2020, 1:57 pm

Quote from dandl on March 31, 2020, 9:54 am

Quote from Dave Voorhis on March 31, 2020, 8:03 am

Quote from dandl on March 30, 2020, 11:11 pm

Yes, just to be clear, as long as the goal is to produce a 'super-SQL' then TTM/D is a good try at what it might look like. The arcane type system is a huge hurdle, making it almost impossible to merge with any reasonably common GP language. Tweaking the type system might be the way to go.

It's no worse than SQL, maybe better, and in fact technically very similar, I've found.

A close coupling between, say Tutorial D / Rel and Java with user-defined types is largely infeasible, but a loose coupling with the Java world doing the client-side thing and Rel doing the server-side thing, with a default translation between primitives (RATIONAL <--> double, INTEGER <--> long, BOOLEAN <--> boolean, CHARACTER <--> String) and some custom translations for a limited set of universal user-defined types (DATE <--> Date, DECIMAL <--> BigDecimal) is reasonable.

I've explored most of those possibilities with Andl. I've produce backends with SQLite and Postgres, a protocol layer with Thrift and a direct API layer. It's all perfectly feasible, but it only solves problems most people think are already solved (by SQL).

But my goal here is quite different. The SQL DQL/DML is extraordinarily widely used as glue for application code to get access to data, but it fails if the data isn't in an RDBMS. You can't even do an SQL query on data in a spreadsheet, which much of it seems to be.

I'm not sure why the popular SQL implementations haven't made more of an effort to support "external tables" (for lack of a better term) where an "external table" looks like any other table from within the DBMS but as actually a connection to an external spreadsheet, CSV file, other DBMS, or whatever.

Microsoft Access does this quite well, and Oracle Database has some facilities for connecting to other DBMSs, but I'm not aware of the facility existing in other popular SQL DBMSs. Maybe it's there and I never noticed?

For better or worse there has been quite a move away from Access-like products, and client-side SQL generally. Most server-side products do stuff like this, but servers make life hard. If I've got a collection of interesting data tables with a mix of CSV, XLS and maybe the odd remote API to manipulate, I don't necessarily want to install and configure Oracle as my first step. Let's say in these current days of market gyrations I'd like to do a bit of analysis on my share portfolio. Is my first thought to install and configure SQL Server? Maybe not, I'm much more likely to reach for Excel and endure the pain.

I use Rel for that and it works well.

But, as we've often discussed before, it could be considerably more agile and I'm still working on that. I've been busy of the last couple of months with paid work and still am, but now working from home frees up (at least) commuting time, so I'll shortly be back to working heavily on my datasheet tool.

I imagine handling updates to external tables in a transactional and performant way is a hurdle. Rel supports external relvars of various kinds (a feature I use constantly), but they're mainly read-only.

But then there are SQL-like non-DBMS things that do this, like PartiQL (see https://aws.amazon.com/blogs/opensource/announcing-partiql-one-query-language-for-all-your-data/), which I think has been mentioned here before.

Starting from scratch, estimate how long it would take me to get the first useful result from AWS? Days, probably. I could use one of these no-code platforms (one of my associates uses Knack, another uses AirTable) but the learning curve is steep, and then they have my data. There's something very reassuring about desktop apps.

For the sorts of things that databases and spreadsheets get used for, knowing that your personal data is here on this machine and nowhere else is very compelling, even for those -- like me -- who are otherwise happy to use cloud-based applications.

Quote from dandl on March 31, 2020, 9:54 am

Quote from Dave Voorhis on March 31, 2020, 8:03 am

Quote from dandl on March 30, 2020, 11:11 pm

Yes, just to be clear, as long as the goal is to produce a 'super-SQL' then TTM/D is a good try at what it might look like. The arcane type system is a huge hurdle, making it almost impossible to merge with any reasonably common GP language. Tweaking the type system might be the way to go.

It's no worse than SQL, maybe better, and in fact technically very similar, I've found.

A close coupling between, say Tutorial D / Rel and Java with user-defined types is largely infeasible, but a loose coupling with the Java world doing the client-side thing and Rel doing the server-side thing, with a default translation between primitives (RATIONAL <--> double, INTEGER <--> long, BOOLEAN <--> boolean, CHARACTER <--> String) and some custom translations for a limited set of universal user-defined types (DATE <--> Date, DECIMAL <--> BigDecimal) is reasonable.

I've explored most of those possibilities with Andl. I've produce backends with SQLite and Postgres, a protocol layer with Thrift and a direct API layer. It's all perfectly feasible, but it only solves problems most people think are already solved (by SQL).

But my goal here is quite different. The SQL DQL/DML is extraordinarily widely used as glue for application code to get access to data, but it fails if the data isn't in an RDBMS. You can't even do an SQL query on data in a spreadsheet, which much of it seems to be.

I'm not sure why the popular SQL implementations haven't made more of an effort to support "external tables" (for lack of a better term) where an "external table" looks like any other table from within the DBMS but as actually a connection to an external spreadsheet, CSV file, other DBMS, or whatever.

Microsoft Access does this quite well, and Oracle Database has some facilities for connecting to other DBMSs, but I'm not aware of the facility existing in other popular SQL DBMSs. Maybe it's there and I never noticed?

For better or worse there has been quite a move away from Access-like products, and client-side SQL generally. Most server-side products do stuff like this, but servers make life hard. If I've got a collection of interesting data tables with a mix of CSV, XLS and maybe the odd remote API to manipulate, I don't necessarily want to install and configure Oracle as my first step. Let's say in these current days of market gyrations I'd like to do a bit of analysis on my share portfolio. Is my first thought to install and configure SQL Server? Maybe not, I'm much more likely to reach for Excel and endure the pain.

I use Rel for that and it works well.

But, as we've often discussed before, it could be considerably more agile and I'm still working on that. I've been busy of the last couple of months with paid work and still am, but now working from home frees up (at least) commuting time, so I'll shortly be back to working heavily on my datasheet tool.

I imagine handling updates to external tables in a transactional and performant way is a hurdle. Rel supports external relvars of various kinds (a feature I use constantly), but they're mainly read-only.

But then there are SQL-like non-DBMS things that do this, like PartiQL (see https://aws.amazon.com/blogs/opensource/announcing-partiql-one-query-language-for-all-your-data/), which I think has been mentioned here before.

Starting from scratch, estimate how long it would take me to get the first useful result from AWS? Days, probably. I could use one of these no-code platforms (one of my associates uses Knack, another uses AirTable) but the learning curve is steep, and then they have my data. There's something very reassuring about desktop apps.

For the sorts of things that databases and spreadsheets get used for, knowing that your personal data is here on this machine and nowhere else is very compelling, even for those -- like me -- who are otherwise happy to use cloud-based applications.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#15 · March 31, 2020, 2:02 pm

Quote from Dave Voorhis on March 31, 2020, 2:02 pm

Quote from Erwin on March 31, 2020, 8:23 am

Quote from Dave Voorhis on March 31, 2020, 8:03 am

Quote from dandl on March 30, 2020, 11:11 pm

Yes, just to be clear, as long as the goal is to produce a 'super-SQL' then TTM/D is a good try at what it might look like. The arcane type system is a huge hurdle, making it almost impossible to merge with any reasonably common GP language. Tweaking the type system might be the way to go.

It's no worse than SQL, maybe better, and in fact technically very similar, I've found.

A close coupling between, say Tutorial D / Rel and Java with user-defined types is largely infeasible, but a loose coupling with the Java world doing the client-side thing and Rel doing the server-side thing, with a default translation between primitives (RATIONAL <--> double, INTEGER <--> long, BOOLEAN <--> boolean, CHARACTER <--> String) and some custom translations for a limited set of universal user-defined types (DATE <--> Date, DECIMAL <--> BigDecimal) is reasonable.

But my goal here is quite different. The SQL DQL/DML is extraordinarily widely used as glue for application code to get access to data, but it fails if the data isn't in an RDBMS. You can't even do an SQL query on data in a spreadsheet, which much of it seems to be.

I'm not sure why the popular SQL implementations haven't made more of an effort to support "external tables" (for lack of a better term) where an "external table" looks like any other table from within the DBMS but as actually a connection to an external spreadsheet, CSV file, other DBMS, or whatever.

Microsoft Access does this quite well, and Oracle Database has some facilities for connecting to other DBMSs, but I'm not aware of the facility existing in other popular SQL DBMSs. Maybe it's there and I never noticed?

I imagine handling updates to external tables in a transactional and performant way is a hurdle. Rel supports external relvars of various kinds (a feature I use constantly), but they're mainly read-only.

But then there are SQL-like non-DBMS things that do this, like PartiQL (see https://aws.amazon.com/blogs/opensource/announcing-partiql-one-query-language-for-all-your-data/), which I think has been mentioned here before.

The system overall (that is, including the spreadsheet editor itself , or whatever stupid text editor is used to edit the csv directly) violates rule 12 by definition.

Countering that violation means making the spreadsheet inaccessible to excel, and the csv inaccessible to notepad.

Then the "spreadsheet" or the "csv" is a mere physical storage format, and a hopelessly inefficient one for most of the purposes. So why bother ?

Because in every organisation I've worked, and for much publicly-available data, those are the formats that 99% of data is provided in. On any given day (and speaking hypothetically but realistically, having had to do almost exactly this) I might need to generate a single number -- like a count of some business-significant thing -- but to generate it I have to access three different DBMSs, an old Access database, seven spreadsheets on three file servers, and two directories each full of CSV files emitted as logs by some machinery. It's quite convenient -- and a nice abstraction -- to represent the lot as tables or relvars and query them easily.

Quote from Erwin on March 31, 2020, 8:23 am

Quote from Dave Voorhis on March 31, 2020, 8:03 am

Quote from dandl on March 30, 2020, 11:11 pm

Yes, just to be clear, as long as the goal is to produce a 'super-SQL' then TTM/D is a good try at what it might look like. The arcane type system is a huge hurdle, making it almost impossible to merge with any reasonably common GP language. Tweaking the type system might be the way to go.

It's no worse than SQL, maybe better, and in fact technically very similar, I've found.

A close coupling between, say Tutorial D / Rel and Java with user-defined types is largely infeasible, but a loose coupling with the Java world doing the client-side thing and Rel doing the server-side thing, with a default translation between primitives (RATIONAL <--> double, INTEGER <--> long, BOOLEAN <--> boolean, CHARACTER <--> String) and some custom translations for a limited set of universal user-defined types (DATE <--> Date, DECIMAL <--> BigDecimal) is reasonable.

But my goal here is quite different. The SQL DQL/DML is extraordinarily widely used as glue for application code to get access to data, but it fails if the data isn't in an RDBMS. You can't even do an SQL query on data in a spreadsheet, which much of it seems to be.

I'm not sure why the popular SQL implementations haven't made more of an effort to support "external tables" (for lack of a better term) where an "external table" looks like any other table from within the DBMS but as actually a connection to an external spreadsheet, CSV file, other DBMS, or whatever.

Microsoft Access does this quite well, and Oracle Database has some facilities for connecting to other DBMSs, but I'm not aware of the facility existing in other popular SQL DBMSs. Maybe it's there and I never noticed?

I imagine handling updates to external tables in a transactional and performant way is a hurdle. Rel supports external relvars of various kinds (a feature I use constantly), but they're mainly read-only.

But then there are SQL-like non-DBMS things that do this, like PartiQL (see https://aws.amazon.com/blogs/opensource/announcing-partiql-one-query-language-for-all-your-data/), which I think has been mentioned here before.

The system overall (that is, including the spreadsheet editor itself , or whatever stupid text editor is used to edit the csv directly) violates rule 12 by definition.

Countering that violation means making the spreadsheet inaccessible to excel, and the csv inaccessible to notepad.

Then the "spreadsheet" or the "csv" is a mere physical storage format, and a hopelessly inefficient one for most of the purposes. So why bother ?

Because in every organisation I've worked, and for much publicly-available data, those are the formats that 99% of data is provided in. On any given day (and speaking hypothetically but realistically, having had to do almost exactly this) I might need to generate a single number -- like a count of some business-significant thing -- but to generate it I have to access three different DBMSs, an old Access database, seven spreadsheets on three file servers, and two directories each full of CSV files emitted as logs by some machinery. It's quite convenient -- and a nice abstraction -- to represent the lot as tables or relvars and query them easily.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#16 · March 31, 2020, 4:26 pm

Quote from Erwin on March 31, 2020, 8:23 am

Quote from Dave Voorhis on March 31, 2020, 8:03 am

I'm not sure why the popular SQL implementations haven't made more of an effort to support "external tables" (for lack of a better term) where an "external table" looks like any other table from within the DBMS but as actually a connection to an external spreadsheet, CSV file, other DBMS, or whatever.

FWIW, not only do Oracle and MS SQL Server allow this, so does SQLite. SQLite virtual table types include CSV, a table with one column whose values are the natural numbers (for joins, I suppose), a table containing the data and metadata of the files in a zip archive, and many others. Some types are exposed as a table-valued function, some as virtual tables, and some as both.

SQLite is also the only DBMS I have ever been able to go from complete ignorance (but knowing SQL) to full usability in less than a day. It is now the one I reach for for any sort of personal project.

The system overall (that is, including the spreadsheet editor itself , or whatever stupid text editor is used to edit the csv directly) violates rule 12 by definition.

Codd's rules only apply to the RDBMS, not any tools external to it. You can use backup and restore to subvert transactions, or a (binary) editor to alter data, or any number of other things. Rule 12 applies specifically to row-at-a-time languages that are part of the DBMS.

#17 · March 31, 2020, 5:23 pm

Quote from johnwcowan on March 31, 2020, 4:26 pm

The system overall (that is, including the spreadsheet editor itself , or whatever stupid text editor is used to edit the csv directly) violates rule 12 by definition.

Codd's rules only apply to the RDBMS, not any tools external to it. You can use backup and restore to subvert transactions, or a (binary) editor to alter data, or any number of other things. Rule 12 applies specifically to row-at-a-time languages that are part of the DBMS.

That is why I said "the system overall". If people still need to ***manage*** the "external" data, then the management tool at hand (excel, notepad++, ...) becomes part and parcel of the overall system used to manage ***all*** the data. Meaning it's supposed to enforce type constraints (they already fail at that point as good as all of them - an XML editor working with fully detailed XSD might be the exception) and all the more complex constraints as well. And it's not because they are [typically] regarded as "just editor tools" that they don't have a data manipulation language just as well, even if that language consists of nothing more than [some means for transparently invoking] the underlying OS calls readBytes() and writeBytes().

And Codd's rules apply to any system used for managing data[bases] (aka DBMS), not only the 'R' ones. Indeed, their very aim was to offer a yardstick to distinguish the 'R' ones from the others. So if they could only be applied to the 'R' ones, what was there to distinguish ?

Author of SIRA_PRISE

#18 · March 31, 2020, 6:18 pm

Quote from Erwin on March 31, 2020, 5:23 pm

That is why I said "the system overall". If people still need to ***manage*** the "external" data, then the management tool at hand (excel, notepad++, ...) becomes part and parcel of the overall system used to manage ***all*** the data. Meaning it's supposed to enforce type constraints (they already fail at that point as good as all of them - an XML editor working with fully detailed XSD might be the exception) and all the more complex constraints as well. And it's not because they are [typically] regarded as "just editor tools" that they don't have a data manipulation language just as well, even if that language consists of nothing more than [some means for transparently invoking] the underlying OS calls readBytes() and writeBytes().

The only way to achieve that would be to place the data in a capability-protected store such that the DBMS software, and only the DBMS software, can access it, nothing generic.

But in any case this is not what Rule 12 says.

And Codd's rules apply to any system used for managing data[bases] (aka DBMS), not only the 'R' ones. Indeed, their very aim was to offer a yardstick to distinguish the 'R' ones from the others. So if they could only be applied to the 'R' ones, what was there to distinguish ?

Yes: an RDBMS must satisfy rule 12. But that is not saying what the RDBMS is. "Possession is nine points of the law, but that is not saying how many points the law might have." --Thomas Cowan

#19 · March 31, 2020, 6:25 pm

Not quoting anything because this is kind of a general comment following from the preceding ones, rather than a specific reply.

There are two distinct and equally reasonable applications of the relational model. One is to build DBMSs, for which Codd's Rules and TTM pre-/pro-scriptions should apply, and the other is to build query systems that aren't DBMSs. It's reasonable to create useful implementations of the relational model that -- for sensible, pragmatic, predictable and logical reasons -- don't follow Codd's Rules and/or aren't a D.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#20 · April 1, 2020, 5:04 am

Quote from dandl on April 1, 2020, 5:04 am

For better or worse there has been quite a move away from Access-like products, and client-side SQL generally. Most server-side products do stuff like this, but servers make life hard. If I've got a collection of interesting data tables with a mix of CSV, XLS and maybe the odd remote API to manipulate, I don't necessarily want to install and configure Oracle as my first step. Let's say in these current days of market gyrations I'd like to do a bit of analysis on my share portfolio. Is my first thought to install and configure SQL Server? Maybe not, I'm much more likely to reach for Excel and endure the pain.

I use Rel for that and it works well.

I don't want to harp on about it, but the lack of documentation and the various little UI 'oddities' have rather put me off. I suspect your familiarity is a good part of why it works well for you. [I might just post a question about that...]

But, as we've often discussed before, it could be considerably more agile and I'm still working on that. I've been busy of the last couple of months with paid work and still am, but now working from home frees up (at least) commuting time, so I'll shortly be back to working heavily on my datasheet tool.

Every cloud...

Starting from scratch, estimate how long it would take me to get the first useful result from AWS? Days, probably. I could use one of these no-code platforms (one of my associates uses Knack, another uses AirTable) but the learning curve is steep, and then they have my data. There's something very reassuring about desktop apps.

For the sorts of things that databases and spreadsheets get used for, knowing that your personal data is here on this machine and nowhere else is very compelling, even for those -- like me -- who are otherwise happy to use cloud-based applications.

I very much like the idea of a desktop app that shares to the cloud, rather than something that lives only in the cloud. I do email, calendar, OneDrive and OneNote that way, and it works well enough. But app development is still broken, IMO. I've used my NAXL project to update my skills on JS, and I rather like the React UI model. But I don't like the language or running in a browser, that's too high a price.

For better or worse there has been quite a move away from Access-like products, and client-side SQL generally. Most server-side products do stuff like this, but servers make life hard. If I've got a collection of interesting data tables with a mix of CSV, XLS and maybe the odd remote API to manipulate, I don't necessarily want to install and configure Oracle as my first step. Let's say in these current days of market gyrations I'd like to do a bit of analysis on my share portfolio. Is my first thought to install and configure SQL Server? Maybe not, I'm much more likely to reach for Excel and endure the pain.

I use Rel for that and it works well.

I don't want to harp on about it, but the lack of documentation and the various little UI 'oddities' have rather put me off. I suspect your familiarity is a good part of why it works well for you. [I might just post a question about that...]

But, as we've often discussed before, it could be considerably more agile and I'm still working on that. I've been busy of the last couple of months with paid work and still am, but now working from home frees up (at least) commuting time, so I'll shortly be back to working heavily on my datasheet tool.

Every cloud...

Starting from scratch, estimate how long it would take me to get the first useful result from AWS? Days, probably. I could use one of these no-code platforms (one of my associates uses Knack, another uses AirTable) but the learning curve is steep, and then they have my data. There's something very reassuring about desktop apps.

For the sorts of things that databases and spreadsheets get used for, knowing that your personal data is here on this machine and nowhere else is very compelling, even for those -- like me -- who are otherwise happy to use cloud-based applications.

I very much like the idea of a desktop app that shares to the cloud, rather than something that lives only in the cloud. I do email, calendar, OneDrive and OneNote that way, and it works well enough. But app development is still broken, IMO. I've used my NAXL project to update my skills on JS, and I rather like the React UI model. But I don't like the language or running in a browser, that's too high a price.

Andl - A New Database Language - andl.org

The Forum for Discussion about The Third Manifesto and Related Matters