NoSQL and SQL and 'relational' -- a mess of disinformation

#1 · January 15, 2021, 1:03 am

Have the tables turned on NoSQL?

Amazing how much confusion you can sow in a short article. I feel a Fabian moment coming on ...

#2 · January 15, 2021, 12:28 pm

Looks like a student essay.

Rubbage.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#3 · January 28, 2021, 8:44 am

Was it really that bad? Anyway, Google have mostly left NoSQL behind now https://www.infoworld.com/article/3209719/review-google-cloud-spanner-takes-sql-to-nosql-scale.html

#4 · January 28, 2021, 9:08 am

Quote from tobega on January 28, 2021, 8:44 am

Was it really that bad?

It wasn't that bad, it just seemed more like a student essay than mature and insightful commentary. It's the sort of thing I used to mark by the dozens in a yearly tranche of 1^st year introductory "first module in computer science" essays. I'd have given it a B-.

Anyway, Google have mostly left NoSQL behind now https://www.infoworld.com/article/3209719/review-google-cloud-spanner-takes-sql-to-nosql-scale.html

I'm not surprised. The early-NoSQL criticisms about SQL being too slow and unscalable had nothing to do with SQL -- i.e., language parsing -- and everything to do with the underlying DBMS engine and architecture. If you can represent storage as tables, you can put SQL on top of it.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#5 · January 28, 2021, 10:34 am

But moving on...

There are things that really don't fit well into tables.

Every application I ever saw has a 'system' table or a 'config' table with one row and zillions of columns
There are times you just want to store a document with a key to find it again
There are things like survey data or sensor data where storing the data matters and the schema doesn't (and missing or invalid data abounds)
A lot of data is time-sequenced and the schema may vary over time
Sometimes you really don't want to have to fight the server, you just want to store and retrieve the data.

NoSQL has a place, but replacing SQL is not it. Tables and a known schema are the strongest solution to a host of problems, but why aren't we getting better at it?

Toon Koopelaars called it 'window on data'. I've been working on that basic problem now for over 40 years, and it's hard to see we're getting closer to a general solution, or whether SQL/RM is part of the solution or part of the problem. I have written software that solves the following problem on several different generations of computing:

Define a table in terms of column name, data type (about 5-10 choices)
Define one or more keys (indexes)
Generate a database
Generate a program that allows data entry and retrieval (form and tabular)
Generate a program that does bulk data import/export
Generate a report program with simple selection and sorting criteria.

I have failed (so far) to write software that does the apparently very similar same thing except:

the schema is defined as data is entered, and can be changed at any time
the data model consists of multiple tables.

Many people have tackled this problem over time, but I've yet to see a good solution.

Why is it so hard?

Is SQL and/or the RM part of the solution, or part of the problem?

Andl - A New Database Language - andl.org

#6 · January 28, 2021, 12:19 pm

Quote from Dave Voorhis on January 28, 2021, 12:19 pm

Quote from dandl on January 28, 2021, 10:34 am

But moving on...

There are things that really don't fit well into tables.

Every application I ever saw has a 'system' table or a 'config' table with one row and zillions of columns

There are times you just want to store a document with a key to find it again

There are things like survey data or sensor data where storing the data matters and the schema doesn't (and missing or invalid data abounds)

A lot of data is time-sequenced and the schema may vary over time

Sometimes you really don't want to have to fight the server, you just want to store and retrieve the data.

NoSQL has a place, but replacing SQL is not it. Tables and a known schema are the strongest solution to a host of problems, but why aren't we getting better at it?

Toon Koopelaars called it 'window on data'. I've been working on that basic problem now for over 40 years, and it's hard to see we're getting closer to a general solution, or whether SQL/RM is part of the solution or part of the problem. I have written software that solves the following problem on several different generations of computing:

Define a table in terms of column name, data type (about 5-10 choices)

Define one or more keys (indexes)

Generate a database

Generate a program that allows data entry and retrieval (form and tabular)

Generate a program that does bulk data import/export

Generate a report program with simple selection and sorting criteria.

I have failed (so far) to write software that does the apparently very similar same thing except:

the schema is defined as data is entered, and can be changed at any time

the data model consists of multiple tables.

Many people have tackled this problem over time, but I've yet to see a good solution.

Why is it so hard?

It's the same problem that makes all interfacing of systems hard, not just interfacing with database systems: it's the "copy problem" or cache update problem.

Keeping copies of data -- and associated metadata, which are just data -- whether code, configurations, documentation, replicas, you-name-it, up-to-date with a definitive source of truth is inherently difficult. It might be essential complexity.

Is SQL and/or the RM part of the solution, or part of the problem?

I don't think it makes it any worse. It's orthogonal to the problem.

Quote from dandl on January 28, 2021, 10:34 am

But moving on...

There are things that really don't fit well into tables.

Every application I ever saw has a 'system' table or a 'config' table with one row and zillions of columns

There are times you just want to store a document with a key to find it again

There are things like survey data or sensor data where storing the data matters and the schema doesn't (and missing or invalid data abounds)

A lot of data is time-sequenced and the schema may vary over time

Sometimes you really don't want to have to fight the server, you just want to store and retrieve the data.

NoSQL has a place, but replacing SQL is not it. Tables and a known schema are the strongest solution to a host of problems, but why aren't we getting better at it?

Toon Koopelaars called it 'window on data'. I've been working on that basic problem now for over 40 years, and it's hard to see we're getting closer to a general solution, or whether SQL/RM is part of the solution or part of the problem. I have written software that solves the following problem on several different generations of computing:

Define a table in terms of column name, data type (about 5-10 choices)

Define one or more keys (indexes)

Generate a database

Generate a program that allows data entry and retrieval (form and tabular)

Generate a program that does bulk data import/export

Generate a report program with simple selection and sorting criteria.

I have failed (so far) to write software that does the apparently very similar same thing except:

the schema is defined as data is entered, and can be changed at any time

the data model consists of multiple tables.

Many people have tackled this problem over time, but I've yet to see a good solution.

Why is it so hard?

It's the same problem that makes all interfacing of systems hard, not just interfacing with database systems: it's the "copy problem" or cache update problem.

Keeping copies of data -- and associated metadata, which are just data -- whether code, configurations, documentation, replicas, you-name-it, up-to-date with a definitive source of truth is inherently difficult. It might be essential complexity.

Is SQL and/or the RM part of the solution, or part of the problem?

I don't think it makes it any worse. It's orthogonal to the problem.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#7 · January 29, 2021, 12:00 am

Yes to essential complexity, not to cache update.

IMO the CRUD for most ordinary WOD (filing cabinet) applications can be defined by a 'universal' metadata schema. I am specifically excluding any kind of computational intervention, just CRUD across a complex data model. The Schema consists of about 20 tables:

Storage: tables, columns, uniqueness keys, storage types, FK dependencies
Access: views, fields, search keys, presentation types
Form sisplay/update: forms, fields, search keys
Tabular display/update: grids, fields, search keys
UI Layout: Master/detail, grids/forms, placement (optional)
List output: reports, fields, selection, sort order, totalling

I wrote this once as NAXL using Rails and SQL generation, but a new version of Rails broke it. I tried it using Windows but the WPF UI was too hard.

I forked and reworked a project called Evolutility, with some success. I wrote another NAXL using JS and React and NoSQL but again the toolset just didn't seem to gel. It was always the technology that made it hard, conceptually it's not complicated. Defeated by accidental complexity yet again. Look here: https://github.com/david-pfx?tab=repositories.

It still doesn't look that hard, just quite a lot of work, so why is no-one trying this? There have been on-line offerings: one called Knack is quite close. But why don't we build apps this way? Really?

Andl - A New Database Language - andl.org

#8 · January 29, 2021, 10:52 am

Quote from Dave Voorhis on January 29, 2021, 10:52 am

Quote from dandl on January 29, 2021, 12:00 am

Yes to essential complexity, not to cache update.

IMO the CRUD for most ordinary WOD (filing cabinet) applications can be defined by a 'universal' metadata schema. I am specifically excluding any kind of computational intervention, just CRUD across a complex data model. The Schema consists of about 20 tables:

Storage: tables, columns, uniqueness keys, storage types, FK dependencies

Access: views, fields, search keys, presentation types

Form sisplay/update: forms, fields, search keys

Tabular display/update: grids, fields, search keys

UI Layout: Master/detail, grids/forms, placement (optional)

List output: reports, fields, selection, sort order, totalling

I wrote this once as NAXL using Rails and SQL generation, but a new version of Rails broke it. I tried it using Windows but the WPF UI was too hard.

I forked and reworked a project called Evolutility, with some success. I wrote another NAXL using JS and React and NoSQL but again the toolset just didn't seem to gel. It was always the technology that made it hard, conceptually it's not complicated. Defeated by accidental complexity yet again. Look here: https://github.com/david-pfx?tab=repositories.

It still doesn't look that hard, just quite a lot of work, so why is no-one trying this? There have been on-line offerings: one called Knack is quite close. But why don't we build apps this way? Really?

It's not that no one is trying this, it's that everyone is trying this, in hundreds or thousands of different and usually incompatible ways, all at once.

Most business applications have two fundamental endpoints of truth -- some user interface at one end that gets user input and presents system output, and some storage at the other end to maintain state. Everything we do in between is dependent on maintaining synchronisation with those endpoints, or synchronising the endpoints with changes made in between them.

In computer science terms, it's a cache update problem -- a diverse, distributed, and largely asynchronous cache update problem. (Note that I use "cache" here as an analogy and abstraction of the problem. I don't mean a cache per se, though the essential problem is isomorphic to cache update.)

Cache update is hard enough on its own, let alone across multiple languages, timeframes, tools, developers, development approaches, everything.

All the apparent solutions, like everything else in software development, has tradeoffs.

Always tradeoffs.

Quote from dandl on January 29, 2021, 12:00 am

Yes to essential complexity, not to cache update.

IMO the CRUD for most ordinary WOD (filing cabinet) applications can be defined by a 'universal' metadata schema. I am specifically excluding any kind of computational intervention, just CRUD across a complex data model. The Schema consists of about 20 tables:

Storage: tables, columns, uniqueness keys, storage types, FK dependencies

Access: views, fields, search keys, presentation types

Form sisplay/update: forms, fields, search keys

Tabular display/update: grids, fields, search keys

UI Layout: Master/detail, grids/forms, placement (optional)

List output: reports, fields, selection, sort order, totalling

I wrote this once as NAXL using Rails and SQL generation, but a new version of Rails broke it. I tried it using Windows but the WPF UI was too hard.

I forked and reworked a project called Evolutility, with some success. I wrote another NAXL using JS and React and NoSQL but again the toolset just didn't seem to gel. It was always the technology that made it hard, conceptually it's not complicated. Defeated by accidental complexity yet again. Look here: https://github.com/david-pfx?tab=repositories.

It still doesn't look that hard, just quite a lot of work, so why is no-one trying this? There have been on-line offerings: one called Knack is quite close. But why don't we build apps this way? Really?

It's not that no one is trying this, it's that everyone is trying this, in hundreds or thousands of different and usually incompatible ways, all at once.

Most business applications have two fundamental endpoints of truth -- some user interface at one end that gets user input and presents system output, and some storage at the other end to maintain state. Everything we do in between is dependent on maintaining synchronisation with those endpoints, or synchronising the endpoints with changes made in between them.

In computer science terms, it's a cache update problem -- a diverse, distributed, and largely asynchronous cache update problem. (Note that I use "cache" here as an analogy and abstraction of the problem. I don't mean a cache per se, though the essential problem is isomorphic to cache update.)

Cache update is hard enough on its own, let alone across multiple languages, timeframes, tools, developers, development approaches, everything.

All the apparent solutions, like everything else in software development, has tradeoffs.

Always tradeoffs.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#9 · January 29, 2021, 12:12 pm

Quote from tobega on January 29, 2021, 12:12 pm

Quote from dandl on January 29, 2021, 12:00 am

Yes to essential complexity, not to cache update.

IMO the CRUD for most ordinary WOD (filing cabinet) applications can be defined by a 'universal' metadata schema. I am specifically excluding any kind of computational intervention, just CRUD across a complex data model. The Schema consists of about 20 tables:

Storage: tables, columns, uniqueness keys, storage types, FK dependencies

Access: views, fields, search keys, presentation types

Form sisplay/update: forms, fields, search keys

Tabular display/update: grids, fields, search keys

UI Layout: Master/detail, grids/forms, placement (optional)

List output: reports, fields, selection, sort order, totalling

I wrote this once as NAXL using Rails and SQL generation, but a new version of Rails broke it. I tried it using Windows but the WPF UI was too hard.

I forked and reworked a project called Evolutility, with some success. I wrote another NAXL using JS and React and NoSQL but again the toolset just didn't seem to gel. It was always the technology that made it hard, conceptually it's not complicated. Defeated by accidental complexity yet again. Look here: https://github.com/david-pfx?tab=repositories.

It still doesn't look that hard, just quite a lot of work, so why is no-one trying this? There have been on-line offerings: one called Knack is quite close. But why don't we build apps this way? Really?

I think the short answer is because you always want a little more that is not covered by the framework.

I have built and sold applications built on products like this back in the 80s, I suppose it worked well enough for simple business needs.

In my current assignment we have at least two bespoke frameworks trying to do something like this and everybody hates them. The problem is that it becomes a whole new language to learn before it gets easy to work with and you tend to forget everything before the next time you need to change something. If a bug happens, finding and fixing it is extremely painful.

I don't know if it's inevitable, but so far we always seem to end up with an EAV table, making it harder to verify the data outside the user interface.

Quote from dandl on January 29, 2021, 12:00 am

Yes to essential complexity, not to cache update.

IMO the CRUD for most ordinary WOD (filing cabinet) applications can be defined by a 'universal' metadata schema. I am specifically excluding any kind of computational intervention, just CRUD across a complex data model. The Schema consists of about 20 tables:

Storage: tables, columns, uniqueness keys, storage types, FK dependencies

Access: views, fields, search keys, presentation types

Form sisplay/update: forms, fields, search keys

Tabular display/update: grids, fields, search keys

UI Layout: Master/detail, grids/forms, placement (optional)

List output: reports, fields, selection, sort order, totalling

I wrote this once as NAXL using Rails and SQL generation, but a new version of Rails broke it. I tried it using Windows but the WPF UI was too hard.

I forked and reworked a project called Evolutility, with some success. I wrote another NAXL using JS and React and NoSQL but again the toolset just didn't seem to gel. It was always the technology that made it hard, conceptually it's not complicated. Defeated by accidental complexity yet again. Look here: https://github.com/david-pfx?tab=repositories.

It still doesn't look that hard, just quite a lot of work, so why is no-one trying this? There have been on-line offerings: one called Knack is quite close. But why don't we build apps this way? Really?

I think the short answer is because you always want a little more that is not covered by the framework.

I have built and sold applications built on products like this back in the 80s, I suppose it worked well enough for simple business needs.

In my current assignment we have at least two bespoke frameworks trying to do something like this and everybody hates them. The problem is that it becomes a whole new language to learn before it gets easy to work with and you tend to forget everything before the next time you need to change something. If a bug happens, finding and fixing it is extremely painful.

I don't know if it's inevitable, but so far we always seem to end up with an EAV table, making it harder to verify the data outside the user interface.

#10 · January 29, 2021, 3:47 pm

Quote from Dave Voorhis on January 29, 2021, 3:47 pm

Quote from tobega on January 29, 2021, 12:12 pm

Quote from dandl on January 29, 2021, 12:00 am

Yes to essential complexity, not to cache update.

IMO the CRUD for most ordinary WOD (filing cabinet) applications can be defined by a 'universal' metadata schema. I am specifically excluding any kind of computational intervention, just CRUD across a complex data model. The Schema consists of about 20 tables:

Storage: tables, columns, uniqueness keys, storage types, FK dependencies

Access: views, fields, search keys, presentation types

Form sisplay/update: forms, fields, search keys

Tabular display/update: grids, fields, search keys

UI Layout: Master/detail, grids/forms, placement (optional)

List output: reports, fields, selection, sort order, totalling

I wrote this once as NAXL using Rails and SQL generation, but a new version of Rails broke it. I tried it using Windows but the WPF UI was too hard.

I forked and reworked a project called Evolutility, with some success. I wrote another NAXL using JS and React and NoSQL but again the toolset just didn't seem to gel. It was always the technology that made it hard, conceptually it's not complicated. Defeated by accidental complexity yet again. Look here: https://github.com/david-pfx?tab=repositories.

It still doesn't look that hard, just quite a lot of work, so why is no-one trying this? There have been on-line offerings: one called Knack is quite close. But why don't we build apps this way? Really?

I think the short answer is because you always want a little more that is not covered by the framework.

I have built and sold applications built on products like this back in the 80s, I suppose it worked well enough for simple business needs.

Reminds me of four things:

IBM System/36 Query.

dBase III (and all its clones).

Jim Button's "PC File", which was all the rage in the 1980's.

Microsoft Access.

Though all exhibited the usual problem with such systems: they were great for simple tasks but hit a wall for more complex ones. Furthermore, most simple tasks have at least one more complex requirement that turns what should be a simple job into an aggravating one.

Though MS Access would let you break through the wall with its inbuilt Visual Basic for Applications (VBA), but it was no better than any other programming environment.

dBase III and friends would let you break through the wall with its inbuilt godawful whatever-it-was language, which was worse than every other programming language by a long shot.

Quote from tobega on January 29, 2021, 12:12 pm

Quote from dandl on January 29, 2021, 12:00 am

Yes to essential complexity, not to cache update.

IMO the CRUD for most ordinary WOD (filing cabinet) applications can be defined by a 'universal' metadata schema. I am specifically excluding any kind of computational intervention, just CRUD across a complex data model. The Schema consists of about 20 tables:

Storage: tables, columns, uniqueness keys, storage types, FK dependencies

Access: views, fields, search keys, presentation types

Form sisplay/update: forms, fields, search keys

Tabular display/update: grids, fields, search keys

UI Layout: Master/detail, grids/forms, placement (optional)

List output: reports, fields, selection, sort order, totalling

I wrote this once as NAXL using Rails and SQL generation, but a new version of Rails broke it. I tried it using Windows but the WPF UI was too hard.

I forked and reworked a project called Evolutility, with some success. I wrote another NAXL using JS and React and NoSQL but again the toolset just didn't seem to gel. It was always the technology that made it hard, conceptually it's not complicated. Defeated by accidental complexity yet again. Look here: https://github.com/david-pfx?tab=repositories.

It still doesn't look that hard, just quite a lot of work, so why is no-one trying this? There have been on-line offerings: one called Knack is quite close. But why don't we build apps this way? Really?

I think the short answer is because you always want a little more that is not covered by the framework.

I have built and sold applications built on products like this back in the 80s, I suppose it worked well enough for simple business needs.

Reminds me of four things:

IBM System/36 Query.
dBase III (and all its clones).
Jim Button's "PC File", which was all the rage in the 1980's.
Microsoft Access.

Though all exhibited the usual problem with such systems: they were great for simple tasks but hit a wall for more complex ones. Furthermore, most simple tasks have at least one more complex requirement that turns what should be a simple job into an aggravating one.

Though MS Access would let you break through the wall with its inbuilt Visual Basic for Applications (VBA), but it was no better than any other programming environment.

dBase III and friends would let you break through the wall with its inbuilt godawful whatever-it-was language, which was worse than every other programming language by a long shot.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

The Forum for Discussion about The Third Manifesto and Related Matters

NoSQL and SQL and 'relational' -- a mess of disinformation