The Forum for Discussion about The Third Manifesto and Related Matters

Please or Register to create posts and topics.

Inappropriate responses to points about language design in SQL

Page 1 of 5Next

Sometimes, when I point out an aspect of SQL that I opine to be an example of bad language design, a respondent goes to the trouble of showing me how to get around the problems that particular feature might engender.  This happened again in the discussion I raised on range variables and I have to say I find it rather annoying.

The point is that if it is not that case that all types are first-class (for example), then most people would agree that is a bit unfortunate.  The same is true if it is the case that not all expressions denoting values can appear nested, as in x/(y+z), where y+z is "nested".  The design of such languages is clearly questionable, especially when the missing features are ones that most people would expect to be able to take for granted.

I am currently engaged in an extremely complicated (for me) exercise using Rel.  I have developed a system in Rel for finding solutions to the problems Gerard Joseph and I investigated in Beyond The Four Fours.  For performance reasons, I save numerous intermediate results by assigning them to base relvars.  The fact that every relation expression in Tutorial D can appear on the RHS of a direct assignment is a boon for me, as is the ability to specify a base relvar's type using the SAME_TYPE_AS notation.

Hugh

Coauthor of The Third Manifesto and related books.
Quote from Hugh on November 7, 2019, 5:33 pm

Sometimes, when I point out an aspect of SQL that I opine to be an example of bad language design, a respondent goes to the trouble of showing me how to get around the problems that particular feature might engender.  This happened again in the discussion I raised on range variables and I have to say I find it rather annoying.

People who live and breathe some programming language are unlikely to be aware of its limitations -- indeed what you think is 'getting around problems' they might think is showing off its power and expressivity. (Pointing out problems is a bit like telling a fish there are other mediums than water.) Three cases in point follow ...

For example: when I point out that TTM/Tutorial D has no way to express tagged unions, a respondent (Dave) goes to the trouble of showing me how to get around that problem using the IM and UNION types. I have to say I find it rather annoying.

The point is that if it is not that case that all types are first-class (for example), then most people would agree that is a bit unfortunate.  The same is true if it is the case that not all expressions denoting values can appear nested, as in x/(y+z), where y+z is "nested".  The design of such languages is clearly questionable, especially when the missing features are ones that most people would expect to be able to take for granted.

I remember debating with an RPG weenie, who claimed RPG was "self-documenting". Furthermore he thought it a positive feature that you could not nest expressions. RPG is that so-called HLL that looks like assembler with card images and fixed column positions. You'd have to write that nesting as (column order is operand1 - op - operand2 - result)

    y    ADD  z     temp;

    x    DIV  temp  result;

Addit: (ah, more of that conversation is coming back to me. It has been over 30 years.) There is a point/feature to remark here (and he did, vehemently). For calculations in a commercial application with MONEY and rates to multiple decimals, for a nested expression the compiler can often give the wrong precision to intermediate results, losing accuracy. So a feature of RPG/by naming the intermediate result you can give a signature right in-line

    y    ADD  z     temp   8V6;   // 8 decimal places before the (virtual) decimal, 6 after

 

I am currently engaged in an extremely complicated (for me) exercise using Rel.  I have developed a system in Rel for finding solutions to the problems Gerard Joseph and I investigated in Beyond The Four Fours.  For performance reasons, I save numerous intermediate results by assigning them to base relvars.  The fact that every relation expression in Tutorial D can appear on the RHS of a direct assignment is a boon for me, as is the ability to specify a base relvar's type using the SAME_TYPE_AS notation.

You assign to base relvars? Is that any different to the way RPG approaches it? Why aren't you just writing a large nested expression in full? Most modern languages allow binding a name to an expression (especially useful for documenting your algorithm, and for debugging) without having to write some temp result out to storage. Furthermore if they're merely calculations, shouldn't they be VIRTUAL? Tutorial D seems to be lacking a feature there. But as a Tutorial D weenie, you wouldn't be aware of that.

Given the language's limitations, writing out to temp tables might be the only workable approach. I used the same tactic in my Sudoku solver (in Ms Access). (I tried making them VIEWs, but the db engine just ground to a halt.) I have seen solvers written as a single SQL SELECT statement. (Oracle DB for example allows statements up to some monstrous character limit; Ms Access not so much.)

Many languages have something equivalent to SAME_TYPE_AS. They maybe don't follow the Tutorial D way as such; are you going to allege that's a 'problem'? If Tutorial D had type inference of the power of Damas-Hindley-Milner, you'd not need type declarations like that at all: the compiler would figure two expressions/variables are the same type.

Frankly, I'd suspect Tutorial D is not very well suited to that particular task. (Just because you put the results in a database does not make it a 'data processing application'.) I'd use a language with higher-order types, first-class functions, partial application (operator sections) and utilities like maps and folds, lazy evaluation. Because I suspect you're doing a lot of searching down combinatorial trees: you want breadth-first, for which lazy evaluation fits hand in glove. I suspect you're writing code with unnecessary complexity to "get around the problems". You seem to be doing exactly what you find annoying in others.

Addit2: If I'm right in thinking the exercise is a combinatorial search, the natural way to approach it would be a recursive algorithm. But writing out intermediate results to storage banjaxes any hope of recursion. Then you have to take an iterative approach that picks up results-so-far from the intermediate variables; so obfuscating the logic with database-handling. Again my Sudoku solver suffered from this (no recursion in SQL).

Quote from Hugh on November 7, 2019, 5:33 pm

Sometimes, when I point out an aspect of SQL that I opine to be an example of bad language design, a respondent goes to the trouble of showing me how to get around the problems that particular feature might engender.  This happened again in the discussion I raised on range variables and I have to say I find it rather annoying.

The point is that if it is not that case that all types are first-class (for example), then most people would agree that is a bit unfortunate.  The same is true if it is the case that not all expressions denoting values can appear nested, as in x/(y+z), where y+z is "nested".  The design of such languages is clearly questionable, especially when the missing features are ones that most people would expect to be able to take for granted.

I am currently engaged in an extremely complicated (for me) exercise using Rel.  I have developed a system in Rel for finding solutions to the problems Gerard Joseph and I investigated in Beyond The Four Fours.  For performance reasons, I save numerous intermediate results by assigning them to base relvars.  The fact that every relation expression in Tutorial D can appear on the RHS of a direct assignment is a boon for me, as is the ability to specify a base relvar's type using the SAME_TYPE_AS notation.

Hugh

SQL is a badly designed language, on all kinds of levels. Please never think that I would defend the language itself for doing the indefensible. At the same time, SQL is widely used, effective and familiar. I can explain an idea with a fragment of SQL that might be hard to get across otherwise. This is intended to be using SQL as a lingua franca, not as an exemplar.

My aim was:

  1. To point out that SQL:
    1. has most of the operators of an RA
    2. has a kind of join that is close to the theta-join in Codd 1972 (as well as others including a natural join similar to Tutorial D)
    3. generally requires/allows programmers to specify the columns in a query explicitly by name.
  2. To ask/speculate how TTM might have turned out if it had made similar choices:
    1. a theta-join/equi-join based on explicitly naming join attributes
    2. generally requiring/allowing the programmer to explicitly name every input/output attribute by name (a weaker form of RM Pre 18).

The rationale for raising the topic is ultimately to ask/speculate whether such a choice would allow the particular heading-based relation and tuple type generators to be abandoned in favour of some kind of tuple/record/struct/class type that would be more familiar to users of other mainstream languages, without losing the many other benefits that TTM offers. In other words, if the relation and tuple type generators are the stumbling block, could there a TTM without them by following this path?

I gave examples of how that might look in both SQL and C# LINQ for illustrative purposes, not intended as a definitive answer or an argument in favour of either.

Andl - A New Database Language - andl.org
Quote from AntC on November 7, 2019, 10:08 pm

For example: when I point out that TTM/Tutorial D has no way to express tagged unions, a respondent (Dave) goes to the trouble of showing me how to get around that problem using the IM and UNION types. I have to say I find it rather annoying.

When you write this, I'm possessed with an almost irresistible urge to show you three or four more examples of the same, each more complex than the last and all relying on virtually every aspect of the IM including extensively clever use of inheritance. It's only my nearly superhuman restraint that keeps me from doing so.

Frankly, I'd suspect Tutorial D is not very well suited to that particular task. (Just because you put the results in a database does not make it a 'data processing application'.) I'd use a language with higher-order types, first-class functions, partial application (operator sections) and utilities like maps and folds, lazy evaluation. ...

Every programmer has languages they like and languages they don't, neither of which has much to do with what languages might be objectively suited -- or unsuited -- to any particular job. It's a good thing, too, because it means there's a great deal of language variety out there. The IT world would be pretty dull if there wasn't.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on November 8, 2019, 8:29 am
Quote from AntC on November 7, 2019, 10:08 pm

For example: when I point out that TTM/Tutorial D has no way to express tagged unions, a respondent (Dave) goes to the trouble of showing me how to get around that problem using the IM and UNION types. I have to say I find it rather annoying.

When you write this, I'm possessed with an almost irresistible urge to show you three or four more examples of the same, each more complex than the last and all relying on virtually every aspect of the IM including extensively clever use of inheritance. It's only my nearly superhuman restraint that keeps me from doing so.

Dave, you're responding to points I did not make (or at least not on this thread); and failing to observe the point I did make. The wording I used was mimicking Hugh's, to show that any language weenie can easily annoy any other language weenie. If I ever find myself wanting inheritance in the sense of Specialisation-by-Constraint I'll remember to ask your extensive clevership.

Frankly, I'd suspect Tutorial D is not very well suited to that particular task. (Just because you put the results in a database does not make it a 'data processing application'.) I'd use a language with higher-order types, first-class functions, partial application (operator sections) and utilities like maps and folds, lazy evaluation. ...

Every programmer has languages they like and languages they don't, neither of which has much to do with what languages might be objectively suited -- or unsuited -- to any particular job. It's a good thing, too, because it means there's a great deal of language variety out there. The IT world would be pretty dull if there wasn't.

I fear in Hugh's case, there's only three languages he knows: SQL, Tutorial D, Rexx -- oh and ISBL, I suppose. Not a broad enough "language variety" to make a judgment on suitability for some task. I suspect (ref Hugh's first post on this thread) that his usage of Tutorial D 'features' fall exactly into the category of "how to get around the problems that particular ["example of bad language design"] might engender." I'm not saying Tutorial D is badly designed as a tutorial-level data manipulation language. I am saying it's not designed for recursive combinatorial search.

I'd say Java is better suited for this task -- it has many of the features I listed.

 

Quote from AntC on November 8, 2019, 9:23 am
Quote from Dave Voorhis on November 8, 2019, 8:29 am
Quote from AntC on November 7, 2019, 10:08 pm

For example: when I point out that TTM/Tutorial D has no way to express tagged unions, a respondent (Dave) goes to the trouble of showing me how to get around that problem using the IM and UNION types. I have to say I find it rather annoying.

When you write this, I'm possessed with an almost irresistible urge to show you three or four more examples of the same, each more complex than the last and all relying on virtually every aspect of the IM including extensively clever use of inheritance. It's only my nearly superhuman restraint that keeps me from doing so.

Dave, you're responding to points I did not make (or at least not on this thread); and failing to observe the point I did make. The wording I used was mimicking Hugh's, to show that any language weenie can easily annoy any other language weenie. If I ever find myself wanting inheritance in the sense of Specialisation-by-Constraint I'll remember to ask your extensive clevership.

I responded to what you wrote, as quoted above.

And I was trying to wind you up. Looks like it worked.

Maybe I should have added a smiley, but it tends to unduly soften the wind-up.

Frankly, I'd suspect Tutorial D is not very well suited to that particular task. (Just because you put the results in a database does not make it a 'data processing application'.) I'd use a language with higher-order types, first-class functions, partial application (operator sections) and utilities like maps and folds, lazy evaluation. ...

Every programmer has languages they like and languages they don't, neither of which has much to do with what languages might be objectively suited -- or unsuited -- to any particular job. It's a good thing, too, because it means there's a great deal of language variety out there. The IT world would be pretty dull if there wasn't.

I fear in Hugh's case, there's only three languages he knows: SQL, Tutorial D, Rexx -- oh and ISBL, I suppose. Not a broad enough "language variety" to make a judgment on suitability for some task. I suspect (ref Hugh's first post on this thread) that his usage of Tutorial D 'features' fall exactly into the category of "how to get around the problems that particular ["example of bad language design"] might engender." I'm not saying Tutorial D is badly designed as a tutorial-level data manipulation language. I am saying it's not designed for recursive combinatorial search.

I'd say Java is better suited for this task -- it has many of the features I listed.

Given that Tutorial D is Hugh's invention, I've no doubt that he enjoys being able to use it for both mundane tasks and stretching its problem-solving boundaries -- as do I, as its implementor -- and as does anyone who creates or implements a computer language.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from AntC on November 7, 2019, 10:08 pm
Quote from Hugh on November 7, 2019, 5:33 pm

Sometimes, when I point out an aspect of SQL that I opine to be an example of bad language design, a respondent goes to the trouble of showing me how to get around the problems that particular feature might engender.  This happened again in the discussion I raised on range variables and I have to say I find it rather annoying.

People who live and breathe some programming language are unlikely to be aware of its limitations -- indeed what you think is 'getting around problems' they might think is showing off its power and expressivity. (Pointing out problems is a bit like telling a fish there are other mediums than water.) Three cases in point follow ...

For example: when I point out that TTM/Tutorial D has no way to express tagged unions, a respondent (Dave) goes to the trouble of showing me how to get around that problem using the IM and UNION types. I have to say I find it rather annoying.

The point is that if it is not that case that all types are first-class (for example), then most people would agree that is a bit unfortunate.  The same is true if it is the case that not all expressions denoting values can appear nested, as in x/(y+z), where y+z is "nested".  The design of such languages is clearly questionable, especially when the missing features are ones that most people would expect to be able to take for granted.

I remember debating with an RPG weenie, who claimed RPG was "self-documenting". Furthermore he thought it a positive feature that you could not nest expressions. RPG is that so-called HLL that looks like assembler with card images and fixed column positions. You'd have to write that nesting as (column order is operand1 - op - operand2 - result)

    y    ADD  z     temp;

    x    DIV  temp  result;

Addit: (ah, more of that conversation is coming back to me. It has been over 30 years.) There is a point/feature to remark here (and he did, vehemently). For calculations in a commercial application with MONEY and rates to multiple decimals, for a nested expression the compiler can often give the wrong precision to intermediate results, losing accuracy. So a feature of RPG/by naming the intermediate result you can give a signature right in-line

    y    ADD  z     temp   8V6;   // 8 decimal places before the (virtual) decimal, 6 after

I am currently engaged in an extremely complicated (for me) exercise using Rel.  I have developed a system in Rel for finding solutions to the problems Gerard Joseph and I investigated in Beyond The Four Fours.  For performance reasons, I save numerous intermediate results by assigning them to base relvars.  The fact that every relation expression in Tutorial D can appear on the RHS of a direct assignment is a boon for me, as is the ability to specify a base relvar's type using the SAME_TYPE_AS notation.

You assign to base relvars? Is that any different to the way RPG approaches it? Why aren't you just writing a large nested expression in full? Most modern languages allow binding a name to an expression (especially useful for documenting your algorithm, and for debugging) without having to write some temp result out to storage. Furthermore if they're merely calculations, shouldn't they be VIRTUAL? Tutorial D seems to be lacking a feature there. But as a Tutorial D weenie, you wouldn't be aware of that.

Given the language's limitations, writing out to temp tables might be the only workable approach. I used the same tactic in my Sudoku solver (in Ms Access). (I tried making them VIEWs, but the db engine just ground to a halt.) I have seen solvers written as a single SQL SELECT statement. (Oracle DB for example allows statements up to some monstrous character limit; Ms Access not so much.)

Many languages have something equivalent to SAME_TYPE_AS. They maybe don't follow the Tutorial D way as such; are you going to allege that's a 'problem'? If Tutorial D had type inference of the power of Damas-Hindley-Milner, you'd not need type declarations like that at all: the compiler would figure two expressions/variables are the same type.

Frankly, I'd suspect Tutorial D is not very well suited to that particular task. (Just because you put the results in a database does not make it a 'data processing application'.) I'd use a language with higher-order types, first-class functions, partial application (operator sections) and utilities like maps and folds, lazy evaluation. Because I suspect you're doing a lot of searching down combinatorial trees: you want breadth-first, for which lazy evaluation fits hand in glove. I suspect you're writing code with unnecessary complexity to "get around the problems". You seem to be doing exactly what you find annoying in others.

Addit2: If I'm right in thinking the exercise is a combinatorial search, the natural way to approach it would be a recursive algorithm. But writing out intermediate results to storage banjaxes any hope of recursion. Then you have to take an iterative approach that picks up results-so-far from the intermediate variables; so obfuscating the logic with database-handling. Again my Sudoku solver suffered from this (no recursion in SQL).

Is Tutorial D's failure to support "tagged unions" a consequence of bad language design?  The language has grown a little since we first defined it but I hope none of the additions can be regarded as ad hoc fudges.

In my complicated Rel example I save intermediate results for performance reasons.  The technique enables me to compute a set of solutions for a particular example, such as finding all the integers from 1 to 400 that can be expressed by operating on the single-digit integers 1, 2, 3, and 4, in a matter of a few hours instead of perhaps a few days.

I use SAME_TYPE_AS in my relvar declarations so that I can easily repeat all the assignments in case I make a mistake in setting them up.  Originally I was using simultaneous declaration and assignment: VAR such-and-such BASE INIT(rel-exp).

Of course it would be nicer if it could be done the way you suggest.  That's Rel's problem, not the language's.

Hugh

Coauthor of The Third Manifesto and related books.
Quote from Hugh on November 8, 2019, 12:04 pm

Is Tutorial D's failure to support "tagged unions" a consequence of bad language design?  The language has grown a little since we first defined it but I hope none of the additions can be regarded as ad hoc fudges.

Of course, with the IM it does support tagged unions, ignoring that some might not like the way it supports tagged unions.

In my complicated Rel example I save intermediate results for performance reasons.  The technique enables me to compute a set of solutions for a particular example, such as finding all the integers from 1 to 400 that can be expressed by operating on the single-digit integers 1, 2, 3, and 4, in a matter of a few hours instead of perhaps a few days.

I use SAME_TYPE_AS in my relvar declarations so that I can easily repeat all the assignments in case I make a mistake in setting them up.  Originally I was using simultaneous declaration and assignment: VAR such-and-such BASE INIT(rel-exp).

Of course it would be nicer if it could be done the way you suggest.  That's Rel's problem, not the language's.

Ouch.

But it's true. As I often boast, there is no optimisation of any kind, anywhere, except a glancing effort to make JOIN slightly less than ponderously slow.

For its intended purposes, that turns out to be fine.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on November 8, 2019, 1:44 pm
To answer Hugh's q first:
Quote from Hugh on November 8, 2019, 12:04 pm

Is Tutorial D's failure to support "tagged unions" a consequence of bad language design?

Yes, I would say so. In particular "tagged unions" in other languages fall within a more general feature that also supports user-defining enumerated types. And enumerated types are enormously useful in type-safe programming. Without them you co-opt INT or CHAR values to stand for the elements of the enumeration (and presumably in TTM would declare a constraint to limit the set of values). But these are not INTs that you want to do arithmetic on nor CHARs that you want to concatenate, etc.

  The language has grown a little since we first defined it but I hope none of the additions can be regarded as ad hoc fudges.

I don't know whether all the IM features were included when "first defined". The way Dave uses the IM to support tagged unions does look like an ad hoc fudge.

Of course, with the IM it does support tagged unions, ignoring that some might not like the way it supports tagged unions.

There's a crucial difference between the IM way vs the way in other languages with proper support: in other languages, the elements of the type are merely different values so are not distinguishable at type level. You can simply test for equality; if two values have different tags that's type-safe and returns FALSE. The IM way, they're different (sub-)types: now sometimes that doesn't matter; sometimes it's even useful. In testing for equality, having different tags (the equivalent thereof) means different types, and sometimes that means ill-typed -- according to some of the furious debate I've seen on the forum.

 

I use SAME_TYPE_AS in my relvar declarations so that I can easily repeat all the assignments in case I make a mistake in setting them up.  Originally I was using simultaneous declaration and assignment: VAR such-and-such BASE INIT(rel-exp).

Yes I used the same tactic in my Sudoku solver: define the 9 possible values in a base table; use those values to label the columns, rows, boxes. (It would have been much better if I could define those values as an enumeration, not merely INT.) I actually started with a 4 x 4 grid to make sure my logic was right before hitting the combinatorial explosion of 9 x 9. That's in Ms Access, not even SQL. I'll readily agree it was a fudge. I don't see that SAME_TYPE_AS is a distinguishing feature of Tutorial D.

 

 

Quote from AntC on November 8, 2019, 9:51 pm

That's in Ms Access, not even SQL.

Do you mean, not even standard SQL?

The JET database engine in MS Access supports a dialect of SQL.

As an aside, and purely out of curiosity, why did you build a Sudoku solver using MS Access, when a closer-to-standard SQL -- say, PostgreSQL, maybe even MySQL -- would undoubtedly have been less annoying?

I presume the goal was to build one using a database engine, and not a general-purpose language or you would have used one.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Page 1 of 5Next