Pre-relational database models' influence on "theoreticians"

#1 · August 3, 2019, 6:53 am

There's somebody I would describe as a RM troll just materialised on StackOverflow. I don't think it's Fabien under an alias(?)

The great bulk of "literature" and textbooks that is promoted and marketed as "relational" is in fact anti-relational. Pre-1970 Record Filing Systems with relational labels, which is evidently all that the many authors and "theoreticians" can understand.

(I say "just materialised" because I've not seen answers/comment from them over the past few years. Their SO profile seems to suggest they were active some time before. That SO q would be an example I had in mind of where junior programmers fail to understand a business context and think fields appear at random in their applications, and they need to build systems/schemas to cope with just anything.)

These "many authors and "theoreticians"" apparently include D&D (see comment chain) -- who I would have thought the least guilty of pre-relational thinking.

Refelecting on John's story of ancient IT history

It was a pretty wild and woolly world, with a sharp divide between "mainframe systems" (very official, very inflexible) and "minicomputer systems" (not so official, but still important to the business, and very much "do what the customer says he wants",

-- and I can't deny that I'm guilty of some programming/systems design I'd now be ashamed of, on 1970's/early 80's minicomputers;

is it pre-relational thinking that's holding back systems from being Truly Relational^TM? I suppose that's true to the extent that SQL's shackle of backwards compatibility still harbours pre-relational approaches.

But much more of an influence these days is Object-Oriented and Object-Relational thinking.

What would be the symptoms of pre-relational thinking in a schema design?

What would be the symptoms of OO or OR thinking?

#2 · August 3, 2019, 7:21 am

Quote from Dave Voorhis on August 3, 2019, 7:21 am

Quote from AntC on August 3, 2019, 6:53 am

There's somebody I would describe as a RM troll just materialised on StackOverflow. I don't think it's Fabien under an alias(?)

The great bulk of "literature" and textbooks that is promoted and marketed as "relational" is in fact anti-relational. Pre-1970 Record Filing Systems with relational labels, which is evidently all that the many authors and "theoreticians" can understand.

(I say "just materialised" because I've not seen answers/comment from them over the past few years. Their SO profile seems to suggest they were active some time before. That SO q would be an example I had in mind of where junior programmers fail to understand a business context and think fields appear at random in their applications, and they need to build systems/schemas to cope with just anything.)

These "many authors and "theoreticians"" apparently include D&D (see comment chain) -- who I would have thought the least guilty of pre-relational thinking.

Refelecting on John's story of ancient IT history

It was a pretty wild and woolly world, with a sharp divide between "mainframe systems" (very official, very inflexible) and "minicomputer systems" (not so official, but still important to the business, and very much "do what the customer says he wants",

-- and I can't deny that I'm guilty of some programming/systems design I'd now be ashamed of, on 1970's/early 80's minicomputers;

is it pre-relational thinking that's holding back systems from being Truly Relational^TM? I suppose that's true to the extent that SQL's shackle of backwards compatibility still harbours pre-relational approaches.

But much more of an influence these days is Object-Oriented and Object-Relational thinking.

What would be the symptoms of pre-relational thinking in a schema design?

What would be the symptoms of OO or OR thinking?

I'm not sure the relational thinkers can even agree on what Truly Relational^TM should be, and database design seems to be at best an ad-hoc craft rather than science or algorithmic process. The best of both relational and OO thinking tends to wind up with essentially the same kind of designs for exactly the same reasons, but most practitioners of both are like most practitioners of everything: mediocre in ability, and not interested in reflecting on it let alone getting better at it. There's much abomination in both, like database designers who insist on putting everything in one big table -- or arbitrarily divided into a few tables -- and programmers who create "object oriented" programs with one class (or several, again chosen arbitrarily) and only static methods.

The commenter identified as "PerformanceDBA" on StackOverflow was on this forum for a while in its email form, notable for being the only member I've ever actively banned. He went under a different name, perhaps his real one, here.

Quote from AntC on August 3, 2019, 6:53 am

There's somebody I would describe as a RM troll just materialised on StackOverflow. I don't think it's Fabien under an alias(?)

The great bulk of "literature" and textbooks that is promoted and marketed as "relational" is in fact anti-relational. Pre-1970 Record Filing Systems with relational labels, which is evidently all that the many authors and "theoreticians" can understand.

(I say "just materialised" because I've not seen answers/comment from them over the past few years. Their SO profile seems to suggest they were active some time before. That SO q would be an example I had in mind of where junior programmers fail to understand a business context and think fields appear at random in their applications, and they need to build systems/schemas to cope with just anything.)

These "many authors and "theoreticians"" apparently include D&D (see comment chain) -- who I would have thought the least guilty of pre-relational thinking.

Refelecting on John's story of ancient IT history

It was a pretty wild and woolly world, with a sharp divide between "mainframe systems" (very official, very inflexible) and "minicomputer systems" (not so official, but still important to the business, and very much "do what the customer says he wants",

-- and I can't deny that I'm guilty of some programming/systems design I'd now be ashamed of, on 1970's/early 80's minicomputers;

is it pre-relational thinking that's holding back systems from being Truly Relational^TM? I suppose that's true to the extent that SQL's shackle of backwards compatibility still harbours pre-relational approaches.

But much more of an influence these days is Object-Oriented and Object-Relational thinking.

What would be the symptoms of pre-relational thinking in a schema design?

What would be the symptoms of OO or OR thinking?

I'm not sure the relational thinkers can even agree on what Truly Relational^TM should be, and database design seems to be at best an ad-hoc craft rather than science or algorithmic process. The best of both relational and OO thinking tends to wind up with essentially the same kind of designs for exactly the same reasons, but most practitioners of both are like most practitioners of everything: mediocre in ability, and not interested in reflecting on it let alone getting better at it. There's much abomination in both, like database designers who insist on putting everything in one big table -- or arbitrarily divided into a few tables -- and programmers who create "object oriented" programs with one class (or several, again chosen arbitrarily) and only static methods.

The commenter identified as "PerformanceDBA" on StackOverflow was on this forum for a while in its email form, notable for being the only member I've ever actively banned. He went under a different name, perhaps his real one, here.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#3 · August 3, 2019, 9:10 am

... notable for being the only member I've ever actively banned.

Ah, that was going to be my second guess. Yes I wondered if I recognised the rhetorical 'style' (if that's the word).

#4 · August 3, 2019, 10:06 am

Leaving aside the relational rant, how would we propose to solve this problem in a TTM/TD context? The IM might be one way, but assuming no IM then what?

I've done this in the past but not using his A or B. My choice was to use two tables, one for strings and one for numerics, and cast as needed. One could well be enough.

Andl - A New Database Language - andl.org

#5 · August 3, 2019, 10:58 pm

Quote from AntC on August 3, 2019, 6:53 am

What would be the symptoms of pre-relational thinking in a schema design?

One obvious indication would be violating 1NF by stuffing an arbitrary number of values into a table location, as by encoding numbers in a string with comma separators. But I do not think this indicates a history of pre-relational thinking; it is one of those obvious but bad ideas that people can easily reinvent time after time.

What would be the symptoms of OO or OR thinking?

Hmm. Making sure that every row in the entire database has a distinct ID, and then using them as quasi-pointers?

#6 · August 4, 2019, 7:49 am

Quote from AntC on August 4, 2019, 7:49 am

Quote from johnwcowan on August 3, 2019, 10:58 pm

Quote from AntC on August 3, 2019, 6:53 am

What would be the symptoms of pre-relational thinking in a schema design?

One obvious indication would be violating 1NF by stuffing an arbitrary number of values into a table location, as by encoding numbers in a string with comma separators.

Not exactly that, but some desktop DBMSs (FilemakerPro, and something I worked on Mac servers) claim a feature of 'repeating groups'. And those have now extended to something that's indistinguishable from RVAs.

But I do not think this indicates a history of pre-relational thinking; it is one of those obvious but bad ideas that people can easily reinvent time after time.

There are traces of such an idea (RVAs) in Codd 1969 -- as Chris Date pointed out a few years ago. That's one of the few ideas that didn't make it through to the better-known Codd 1970. And purely by coincidence, Fabian has just posted an explanation: Codd thought they would involve second-order logic. Discussion on this forum (a long time ago) thought probably it wouldn't.

I think one shibboleth for pre-relational would be 'Primary Key': filestore/indexed-sequential systems typically supported only one key with a UNIQUE constraint; and FOREIGN KEYs (if supported) were compelled to reference only that key. 'Secondary indexes' typically couldn't impose UNIQUE constraints (to avoid a performance hit on update), and searching/querying was noticeably slower. They were intended for (say) lookup by name or phone number to get the unique CustomerID, with those new-fangled 'online applications'. This has a curious correlation with your next point ...

What would be the symptoms of OO or OR thinking?

Hmm. Making sure that every row in the entire database has a distinct ID, and then using them as quasi-pointers?

Good point, as examplified by that SO question I started from. 'PerformanceDBA' was correct to call that out.

Is that OO thinking? Or is that because most visual IDEs for schema definition automatically create an ID of integral type as first column, and make it the key? Do junior programmers think that must be 'best practice'? It is a proverbial pain to override to a non-numeric ID.

Come to think ... On one project I had to rescue, the propellor-heads had built a reporting 'datamart' over a business application. The application's data model had 'meaningless' integral PRIMARY KEYs on every row -- including on 'bridging tables' whose only purpose was to link other 'meaningless' IDs. The 'datamart' then allocated a whole new layer of 'meaningless' IDs to shadow the IDs that were already meaningless. Some argument about this being standard practice in datamarts/slowly-changing dimensions blah, blah. But the application's meaningless IDs were already persistent, with all the changing dimensions non-key.

I was told I was being old-fashioned for worrying about disk space. No that wasn't my worry: it was the bottleneck on the GUID-allocator. Did I mention these IDs had to be globally unique, not just unique within their table?

Then can we say GUIDs (database-global) are a more reliable indicator of OO thinking?

Quote from johnwcowan on August 3, 2019, 10:58 pm

Quote from AntC on August 3, 2019, 6:53 am

What would be the symptoms of pre-relational thinking in a schema design?

One obvious indication would be violating 1NF by stuffing an arbitrary number of values into a table location, as by encoding numbers in a string with comma separators.

Not exactly that, but some desktop DBMSs (FilemakerPro, and something I worked on Mac servers) claim a feature of 'repeating groups'. And those have now extended to something that's indistinguishable from RVAs.

But I do not think this indicates a history of pre-relational thinking; it is one of those obvious but bad ideas that people can easily reinvent time after time.

There are traces of such an idea (RVAs) in Codd 1969 -- as Chris Date pointed out a few years ago. That's one of the few ideas that didn't make it through to the better-known Codd 1970. And purely by coincidence, Fabian has just posted an explanation: Codd thought they would involve second-order logic. Discussion on this forum (a long time ago) thought probably it wouldn't.

I think one shibboleth for pre-relational would be 'Primary Key': filestore/indexed-sequential systems typically supported only one key with a UNIQUE constraint; and FOREIGN KEYs (if supported) were compelled to reference only that key. 'Secondary indexes' typically couldn't impose UNIQUE constraints (to avoid a performance hit on update), and searching/querying was noticeably slower. They were intended for (say) lookup by name or phone number to get the unique CustomerID, with those new-fangled 'online applications'. This has a curious correlation with your next point ...

What would be the symptoms of OO or OR thinking?

Hmm. Making sure that every row in the entire database has a distinct ID, and then using them as quasi-pointers?

Good point, as examplified by that SO question I started from. 'PerformanceDBA' was correct to call that out.

Is that OO thinking? Or is that because most visual IDEs for schema definition automatically create an ID of integral type as first column, and make it the key? Do junior programmers think that must be 'best practice'? It is a proverbial pain to override to a non-numeric ID.

Come to think ... On one project I had to rescue, the propellor-heads had built a reporting 'datamart' over a business application. The application's data model had 'meaningless' integral PRIMARY KEYs on every row -- including on 'bridging tables' whose only purpose was to link other 'meaningless' IDs. The 'datamart' then allocated a whole new layer of 'meaningless' IDs to shadow the IDs that were already meaningless. Some argument about this being standard practice in datamarts/slowly-changing dimensions blah, blah. But the application's meaningless IDs were already persistent, with all the changing dimensions non-key.

I was told I was being old-fashioned for worrying about disk space. No that wasn't my worry: it was the bottleneck on the GUID-allocator. Did I mention these IDs had to be globally unique, not just unique within their table?

Then can we say GUIDs (database-global) are a more reliable indicator of OO thinking?

#7 · August 4, 2019, 8:54 am

Quote from Dave Voorhis on August 4, 2019, 8:54 am

Quote from AntC on August 4, 2019, 7:49 am

Then can we say GUIDs (database-global) are a more reliable indicator of OO thinking?

I consider myself an "OO thinker", having written object-oriented code on an almost daily basis since the mid-1980's. The notion that object-orientation revolves around "object identity" is a categorisation that often comes from OO outsiders, almost never from insiders. I very rarely have any reason to think about object identity. It just doesn't come up. On the other hand, I often have to think about how to define one instance of a given class as being equal to another.

Therefore, I'd say GUIDs aren't a reliable indicator of OO thinking. Insistence on numerical surrogate keys everywhere is a database practitioner smell that I saw long before object oriented programming became prevalent, and was often promulgated by folks who knew nothing about object orientation, or even programming. Its proponents have their reasons, which I won't go into here.

I'd say object oriented thinking is identified by an inclination toward:

Casual and frequent creation of classes to represent types, modules, and records, and finding it reasonable to use the one class mechanism for all three;¹

Strongly coupling data and procedures that manipulate it, typically tightly bundled into classes or loosely bundled via (if available) multimethods;

Use of inheritance (or interface implementation) and polymorphism to support coding to abstractions (base classes and/or interfaces) with one or more concrete implementations of each abstraction.

It's pretty difficult to demonstrate any of that way of thinking, for good or bad, in typical database (i.e., SQL) environments.

--

¹ I knew of a business that required its programmers to submit a formal request to create a new class whenever one was needed in the company software. The request was ratified (or not) by executive, and each approved class formally announced as a business achievement. E.g., "On Tuesday the 23^rd, the software development team was pleased to announce they received approval from Executive to proceed with creation of a new BspCustFlbAnnce class, which will permit us to capture and manipulate information about blah blah blah ... leading to successful competition in such and such markets." That business was not demonstrating object-oriented thinking!

Quote from AntC on August 4, 2019, 7:49 am

Then can we say GUIDs (database-global) are a more reliable indicator of OO thinking?

I consider myself an "OO thinker", having written object-oriented code on an almost daily basis since the mid-1980's. The notion that object-orientation revolves around "object identity" is a categorisation that often comes from OO outsiders, almost never from insiders. I very rarely have any reason to think about object identity. It just doesn't come up. On the other hand, I often have to think about how to define one instance of a given class as being equal to another.

Therefore, I'd say GUIDs aren't a reliable indicator of OO thinking. Insistence on numerical surrogate keys everywhere is a database practitioner smell that I saw long before object oriented programming became prevalent, and was often promulgated by folks who knew nothing about object orientation, or even programming. Its proponents have their reasons, which I won't go into here.

I'd say object oriented thinking is identified by an inclination toward:

Casual and frequent creation of classes to represent types, modules, and records, and finding it reasonable to use the one class mechanism for all three;¹
Strongly coupling data and procedures that manipulate it, typically tightly bundled into classes or loosely bundled via (if available) multimethods;
Use of inheritance (or interface implementation) and polymorphism to support coding to abstractions (base classes and/or interfaces) with one or more concrete implementations of each abstraction.

It's pretty difficult to demonstrate any of that way of thinking, for good or bad, in typical database (i.e., SQL) environments.

--

¹ I knew of a business that required its programmers to submit a formal request to create a new class whenever one was needed in the company software. The request was ratified (or not) by executive, and each approved class formally announced as a business achievement. E.g., "On Tuesday the 23^rd, the software development team was pleased to announce they received approval from Executive to proceed with creation of a new BspCustFlbAnnce class, which will permit us to capture and manipulate information about blah blah blah ... leading to successful competition in such and such markets." That business was not demonstrating object-oriented thinking!

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#8 · August 4, 2019, 9:00 am

Quote from AntC on August 4, 2019, 7:49 am

Quote from johnwcowan on August 3, 2019, 10:58 pm

Quote from AntC on August 3, 2019, 6:53 am

What would be the symptoms of pre-relational thinking in a schema design?

One obvious indication would be violating 1NF by stuffing an arbitrary number of values into a table location, as by encoding numbers in a string with comma separators.

Not exactly that, but some desktop DBMSs (FilemakerPro, and something I worked on Mac servers) claim a feature of 'repeating groups'. And those have now extended to something that's indistinguishable from RVAs.

I'm reminded of PICK and other so-called "multivalue" database systems, which make violation of 1NF a guiding principle.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#9 · August 4, 2019, 11:11 am

Quote from Dave Voorhis on August 4, 2019, 8:54 am

Quote from AntC on August 4, 2019, 7:49 am

Then can we say GUIDs (database-global) are a more reliable indicator of OO thinking?

I consider myself an "OO thinker", having written object-oriented code on an almost daily basis since the mid-1980's. The notion that object-orientation revolves around "object identity" is a categorisation that often comes from OO outsiders, almost never from insiders. I very rarely have any reason to think about object identity. It just doesn't come up. On the other hand, I often have to think about how to define one instance of a given class as being equal to another.

My background is somewhat similar, but with a strong focus on building software with 'walls': stringent controls over dependencies and visibility. I was writing OO-like code in C before there was C++, for this sole purpose. I care about abstraction, encapsulation, separation of concerns far more than inheritance and polymorphism.

If I detest SQL and the databases that depend on it, it's not for their lack of objects, but for their lack of locality. Everything is public, out there for all to see (and mess with). We're never more than one GRANT from the apocalypse.

Andl - A New Database Language - andl.org

#10 · August 4, 2019, 11:31 am

Quote from dandl on August 4, 2019, 11:11 am

Quote from Dave Voorhis on August 4, 2019, 8:54 am

Quote from AntC on August 4, 2019, 7:49 am

Then can we say GUIDs (database-global) are a more reliable indicator of OO thinking?

I consider myself an "OO thinker", having written object-oriented code on an almost daily basis since the mid-1980's. The notion that object-orientation revolves around "object identity" is a categorisation that often comes from OO outsiders, almost never from insiders. I very rarely have any reason to think about object identity. It just doesn't come up. On the other hand, I often have to think about how to define one instance of a given class as being equal to another.

My background is somewhat similar, but with a strong focus on building software with 'walls': stringent controls over dependencies and visibility. I was writing OO-like code in C before there was C++, for this sole purpose. I care about abstraction, encapsulation, separation of concerns far more than inheritance and polymorphism.

I also care about abstraction, encapsulation, and separation of concerns. Inheritance and polymorphism are a means -- not the only one, obviously -- of implementing separation of concerns between general abstractions and specific implementations.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

The Forum for Discussion about The Third Manifesto and Related Matters

Pre-relational database models' influence on "theoreticians"