The Forum for Discussion about The Third Manifesto and Related Matters

Please or Register to create posts and topics.

Object-Relational hybrids and `id` as Primary key

PreviousPage 2 of 2
Quote from AntC on February 14, 2023, 12:08 pm
Quote from dandl on February 14, 2023, 9:47 am

 

... when they rock up in the middle of the night or give a different spelling of the name, or the wrong/duplicate UR number when it's actually a relative with the same name.

Heh, heh. One of the projects I was dragged into -- and I'll have to kill you if I tell too much -- was tracing crims (not) paying fines. They'd do anything to have their 'career' at one end of the country not connected to their new identity at the other. The police/court systems weren't nationally co-ordinated, so it fell to a central government long after the court case to try to connect everything up. Yes there was a messy process to try to link them. But occasionally they linked too many. (Father and son, same named, who occasionally did live at the same address.) So you couldn't blat all the data into one id/there was an even messier process to partially unlink.

It's sounding more and more like a retake on Rails.

Is this because they're all trying to graft OO on top of Relational, and doing it badly? Under OO everything is an Object, so must have a (GU)ID. There might be arbitrarily complex data items 'belonging to' that object -- IOW this is a hierarchical data structure.

So an Invoice 'belongs to' a customer; an Invoice line 'belongs to' an Invoice; a Product sold 'belongs to' an Invoice line. Except a Product also 'belongs to' a warehouse/sitting on the shelves; and 'belongs to' a Purchase line to restock it. Etc. I probably mean an instance of Product/there's no instance in two places at the same time, but we're not going to individually barcode each can of beans as it goes through the warehouse.

So does OO have a concept of disparate Objects linking in multiple roles to multiple other Objects? How does that go with encapsulation? The linking is not a message passed at a point in time, but a permanent connection.

At runtime OO has instances (of classes, usually) that reference other instances. The various ways of doing that don't necessarily violate encapsulation. The linking is done via a "message passed" -- more typically a method call in modern languages, though conceptually the same thing -- that says this is related to that, but the connection is only permanent insofar as it's not explicitly changed by the same message/method establishing a new connection.

Sounds like you're describing an Object-Relational Mapper, a device that the sales brochure claims will allow you to easily persist instances and (a) be able to switch DBMS products (a thing that almost never happens during the lifetime of a product but is usually relatively easy when it does happen); and (b) allows mediocre developers to avoid having to write any SQL by exchanging simple and straightforward SQL query invocations for a labyrinthine morass of intricate object graphs, complexity, poor performance, eye-poppingly weird breakage, and almost endlessly frustrating debugging.

Autogenerated and auto-incremented integer or UUID primary keys -- often called 'id' -- are typical in ORMs.

ORM pitfalls are numerous and painful with little to no benefit -- except perhaps on university assignment-sized projects -- but their advocates use them with great delight and insist that, for example, typically spending weeks debugging the obscure bugginess they cause, on a per-bug basis, is worth it for all the benefits they provide. (See above re "switch DBMS products" and "avoid having to write any SQL" for the full list of those benefits.)

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on February 14, 2023, 10:45 pm

..., but the connection is only permanent insofar as it's not explicitly changed by the same message/method establishing a new connection.

Thanks Dave, pretty much as I suspected.

What (I see as) permanent is:

  • entities have a business key -- that is, not id, and as a key it must be unique across all entities/instances in that class/table. How do you represent that with message passing?
  • 'foreign keys' -- that is, not necessarily pointing to ids -- mustn't be left to 'dangle' after a deletion or update at the target instance/table; where I mean 'target' of the key, not target of any message.

ORM pitfalls are numerous and painful with little to no benefit -- except perhaps on university assignment-sized projects

(yeah, I suspect that's what raised the q)

... "avoid having to write any SQL" for the full list of those benefits.)

I guess ... if your assignment only goes so far as single-row insert/update/delete; and single-row lookup to validate some transaction.

The SO q was ostensibly about normalisation and superfluous appearance of (foreign) keys. But my answer more had in mind suitability for writing typical SQL queries, as opposed to row-by-agonising-row searches.

(a) be able to switch DBMS products (a thing that almost never happens during the lifetime of a product but is usually relatively easy when it does happen);

But doesn't that just mean lock-in to a particular ORM, instead of lock-in (alleged) to a particular DBMS? I agree with your "relatively easy".

What if our university graduate knows all about Laravel, but then goes for a job at a Ruby-on-Rails site or whatever? Do they have any higher-level understanding of 'Relational' -- at least to the extent you can abstract from SQL to 'Relational'? This particular SO question's proposed design was a viper's nest of FKs in all directions, exhibiting no understanding how easy it is to produce a join view via the (hated) SQL -- you can join so many as six (gasp!) tables all in the same query.

Quote from AntC on February 15, 2023, 4:21 am
Quote from Dave Voorhis on February 14, 2023, 10:45 pm

..., but the connection is only permanent insofar as it's not explicitly changed by the same message/method establishing a new connection.

Thanks Dave, pretty much as I suspected.

What (I see as) permanent is:

  • entities have a business key -- that is, not id, and as a key it must be unique across all entities/instances in that class/table. How do you represent that with message passing?

ORMs typically provide a way of identifying keys, whether business keys or auto-generated or both.

Note that it's not considered "message passing" -- that terminology that hasn't been popular in object oriented programming since early versions of Smalltalk (and usually hasn't been implemented in popular object oriented programming languages using actual message passing since early versions of Smalltalk) -- but it's typically specified via method invocations (i.e., procedure calls in an object oriented language) or some declarative mechanism such as Java annotations, which internally perform method invocations.

  • 'foreign keys' -- that is, not necessarily pointing to ids -- mustn't be left to 'dangle' after a deletion or update at the target instance/table; where I mean 'target' of the key, not target of any message.

ORMs often provide facilities to automatically purge or "detach" dangling references of that sort.

ORM pitfalls are numerous and painful with little to no benefit -- except perhaps on university assignment-sized projects

(yeah, I suspect that's what raised the q)

... "avoid having to write any SQL" for the full list of those benefits.)

I guess ... if your assignment only goes so far as single-row insert/update/delete; and single-row lookup to validate some transaction.

The SO q was ostensibly about normalisation and superfluous appearance of (foreign) keys. But my answer more had in mind suitability for writing typical SQL queries, as opposed to row-by-agonising-row searches.

(a) be able to switch DBMS products (a thing that almost never happens during the lifetime of a product but is usually relatively easy when it does happen);

But doesn't that just mean lock-in to a particular ORM, instead of lock-in (alleged) to a particular DBMS? I agree with your "relatively easy".

Yes, and lock-in to a particular ORM is usually far harder to escape than "lock-in" to a particular DBMS.

But the ORM vendors don't want you to know that.

What if our university graduate knows all about Laravel, but then goes for a job at a Ruby-on-Rails site or whatever? Do they have any higher-level understanding of 'Relational' -- at least to the extent you can abstract from SQL to 'Relational'? This particular SO question's proposed design was a viper's nest of FKs in all directions, exhibiting no understanding how easy it is to produce a join view via the (hated) SQL -- you can join so many as six (gasp!) tables all in the same query.

If our grad knows all about Laravel and goes to a Ruby-on-Rails site, he'll have some general understanding of ORMs and might have picked up some appreciation for the usual ORM quirks, though (speaking aside, here) he's already in the technical doldrums, having focused -- or been focused -- on PHP (Laravel) and Ruby (Ruby-on-Rails), which are both in serious industrial decline. He'd be better off learning Hibernate (Java), Entity Framework (.NET/C#), or whatever is used with Node.js and Python these days.

One ORM is pretty much like another, they're all mostly awful -- except for assignment-sized projects -- and those who learn them to the exclusion of SQL wind up being pretty much useless with ORMs, too.

That's because using ORMs almost inevitably introduces performance bottlenecks that are addressed by either writing native SQL (usually in some ORM-aware fashion) or some SQL-like language provided by the ORM. For example, Hibernate has its own "query language" called HQL (see https://docs.jboss.org/hibernate/core/3.3/reference/en/html/queryhql.html) which is a gloriously foul blending of the worst aspects of SQL and the worst aspects of object oriented programming.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

Sorry to but in, but has anybody mentioned TTM's RM VSS1?  Perhaps it's not really relevant to the discussion at hand, which seems mostly about when to and when not to use it.

Hugh

Coauthor of The Third Manifesto and related books.

I just want to quibble over 'message passing'. This is a technique whereby arguments to a function are bundled into a package along with some kind of receiver ID. The point being the sender does not know who the receiver will be, the message handling framework makes that decision, typically at runtime. It was already widely used in C programming long before SmallTalk, and provides an excellent way to build highly modular systems of small components. I seem to recall it was the foundation of Minix.

Andl - A New Database Language - andl.org
Quote from Hugh on February 15, 2023, 11:23 am

Sorry to but in, but has anybody mentioned TTM's RM VSS1?  Perhaps it's not really relevant to the discussion at hand, which seems mostly about when to and when not to use it.

 

Thanks Hugh, yes RM VSS1 is implicitly in the background of this discussion.

By all means a "D should provide a mechanism ...". What we're objecting to is the tool making Very Strong Suggestions that every relvar's (only) candidate key be "supplied by the system".  Furthermore that foreign keys point only to a relvar's system-supplied key.

Quote from dandl on February 15, 2023, 12:28 pm

I just want to quibble over 'message passing'. This is a technique whereby arguments to a function are bundled into a package along with some kind of receiver ID. The point being the sender does not know who the receiver will be, the message handling framework makes that decision, typically at runtime. It was already widely used in C programming long before SmallTalk, and provides an excellent way to build highly modular systems of small components. I seem to recall it was the foundation of Minix.

Thanks, and thanks to Dave for correcting my use of outdated terminology.

What I'm unhappy with -- whether message-passing or method invocations -- is that setting up foreign keys is one-time/async, not 'permanent'.

ORMs often provide facilities to automatically purge or "detach" dangling references of that sort.

But not a mechanism to reject a transaction that will leave a dangling reference? That's been caused because a junior programmer hasn't grokked the vipers' nest of links.

Presumably if the ORM has an actual SQL DBMS behind it, with actual REFERENCES declared in the table, there will be an inscrutable SQL error. Or do these tools auto-declare ON DELETE CASCADE?

Quote from AntC on February 15, 2023, 9:36 pm
Quote from dandl on February 15, 2023, 12:28 pm

I just want to quibble over 'message passing'. This is a technique whereby arguments to a function are bundled into a package along with some kind of receiver ID. The point being the sender does not know who the receiver will be, the message handling framework makes that decision, typically at runtime. It was already widely used in C programming long before SmallTalk, and provides an excellent way to build highly modular systems of small components. I seem to recall it was the foundation of Minix.

Thanks, and thanks to Dave for correcting my use of outdated terminology.

What I'm unhappy with -- whether message-passing or method invocations -- is that setting up foreign keys is one-time/async, not 'permanent'.

It can be permanent, or at least to the extent that it's permanent in the SQL database that underpins it, which is only an ALTER or two from changing it.

ORMs often provide facilities to automatically purge or "detach" dangling references of that sort.

But not a mechanism to reject a transaction that will leave a dangling reference?

Yes, in some ORMs you can reject such a transaction.

That's been caused because a junior programmer hasn't grokked the vipers' nest of links.

Presumably if the ORM has an actual SQL DBMS behind it, with actual REFERENCES declared in the table, there will be an inscrutable SQL error. Or do these tools auto-declare ON DELETE CASCADE?

Yes, they can auto-declare ON DELETE CASCADE. Or you can write it explicitly, along with using SQL for the rest of the database schema implementation and tell the ORM to use it.

There's a lot of variety in the ORM space, and plenty of capability. The common factor is that they all invisibly turn instance interactions into corresponding auto-generated SQL invocations, along with often allowing native SQL or SQL-like languages to be used to create explicit queries.

In principle, they're great -- you don't have to plow through any SQL drudgery to get INSERT, UPDATE, DELETE, and SELECT written on your behalf, and you conveniently interact with records in the form of class instances.

In practice, as soon as you scale beyond database-driven toys, they're awful.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from AntC on February 15, 2023, 9:25 pm
Quote from Hugh on February 15, 2023, 11:23 am

Sorry to but in, but has anybody mentioned TTM's RM VSS1?  Perhaps it's not really relevant to the discussion at hand, which seems mostly about when to and when not to use it.

 

Thanks Hugh, yes RM VSS1 is implicitly in the background of this discussion.

By all means a "D should provide a mechanism ...". What we're objecting to is the tool making Very Strong Suggestions that every relvar's (only) candidate key be "supplied by the system".  Furthermore that foreign keys point only to a relvar's system-supplied key.

Yes, that's crazy.  I missed the "only" bit and the "every" bit is clearly wrong anyway.

Hugh

 

Coauthor of The Third Manifesto and related books.
PreviousPage 2 of 2