Which Reality?

#51 · December 7, 2021, 4:24 pm

Quote from Dave Voorhis on December 7, 2021, 1:54 pm

It will perform just fine for those specific operations that it performs -- treating scalar values as opaque -- though if used to implement scalar expressions (which Paul confirmed in a post above, above) it probably won't perform very well compared to implementing scalar expressions in the usual fashion.

Yes. So a JOIN to a PLUS relation (relcon if you prefer that term) is ideally going to ultimately (in the best case, if not all when we consider big integers, rationals etc) resolve down in an implementation to a single machine instruction (a RISC-V add say) for each pair of input "substitute" "scalar" values.

Of course this is strictly just an implementation mater, but if the model and syntax can help the implementation out here, then yes that is a good thing (and would make any prototype simpler to code in an existing language)

#52 · December 7, 2021, 4:58 pm

Quote from Paul Vernon on December 7, 2021, 4:24 pm

Quote from Dave Voorhis on December 7, 2021, 1:54 pm

It will perform just fine for those specific operations that it performs -- treating scalar values as opaque -- though if used to implement scalar expressions (which Paul confirmed in a post above, above) it probably won't perform very well compared to implementing scalar expressions in the usual fashion.

Yes. So a JOIN to a PLUS relation (relcon if you prefer that term) is ideally going to ultimately (in the best case, if not all when we consider big integers, rationals etc) resolve down in an implementation to a single machine instruction (a RISC-V add say) for each pair of input "substitute" "scalar" values.

Of course this is strictly just an implementation mater, but if the model and syntax can help the implementation out here, then yes that is a good thing (and would make any prototype simpler to code in an existing language)

Without having an entirely clear picture of what you have in mind -- both from this and the other related posts on the subject -- it sounds like the idea is that you'll have scalar expressions that look like scalar expressions -- e.g., myAverage = thingTotal / thingCount -- which will actually be internally implemented via JOIN to a divide relation, but that isn't visible and it will compile and optimise down to a machine-level DIV opcode.

In that case, what are you gaining from the JOIN to a divide relation?

Or have I misunderstood what you have in mind?

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#53 · December 7, 2021, 7:02 pm

Quote from Paul Vernon on December 7, 2021, 3:48 pm

Quote from Dave Voorhis on December 7, 2021, 2:12 pm

Then the next contentious consideration will be whether it is a statically-checked (i.e., errors checked before run-time) #3 or a dynamically-checked (i.e., no errors checked before run-time) #3.

Errors? Oh, I would not have them. I refer the honourable gentlemen to the answer I gave earlier https://forum.thethirdmanifesto.com/forum/topic/adding-exceptions-to-tutorial-d/?part=2#postid-946082

What I would have, at development, code commit/review/test time and at submit time, would be actionable feedback that might say "Your expression/statement fragment will always [return empty/never return in your lifetime

Curious how you'll determine "never return in your lifetime", or generally determine "return empty" for arbitrary code, given it's the Halting Problem.

/be semantically really very dubious] given [the database's current constraints/the database's current data/logic/declared semantics]. Is that what you [want/wanted], because that's [what will/is going to/has likely already] happen[ed]?"

or things like "You asserted that the cardinality of the result of this join should be the same cardinality as the left input into the join. I can't prove that will be the case given the current database constraints, but I can see that given the current data (as of the time of this message) your assertion will be true. Up to you if you still want to execute your statement as-is or modify it so that e.g. it does not [update|return] anything if your assertion is not true at run-time."

Sounds like a pre-execution-time (usually known as compile-time, whether there's actually compilation or not) warning.

How do you intend to handle something like division by zero at run-time?

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#54 · December 7, 2021, 7:44 pm

Quote from Dave Voorhis on December 7, 2021, 7:02 pm

Curious how you'll determine "never return in your lifetime", or generally determine "return empty" for arbitrary code, given it's the Halting Problem.

Where a cost can't be determined, or where an output cardinality can't be be determined, you would get a comment back along the lines of "I've little to no idea how long your code might run or how many rows it might return. Up to you if you want to run it anyway."

Sounds like a pre-execution-time (usually known as compile-time, whether there's actually compilation or not) warning.

Yes (or at least it is before - or possibly at - the moment the statement starts executing. Certainly not once it has started - well, maybe you could get messages from some external to the model system that monitors the running of an expression... but it would not affect the query - unless I guess it is a kind of "governor" that can stop badly costed statements from over running)

How do you intend to handle something like division by zero at run-time?

Empty result (i.e. no row for any join to the DIVIDE relation where the 2nd input is a 0 - which BTW would not be implemented as a join (at least not for such specially named things like the DIVIDE relation ;-) ) )

#55 · December 7, 2021, 8:45 pm

Quote from Paul Vernon on December 7, 2021, 4:24 pm

but if the model and syntax can help the implementation out here

There is a particular comment I want to make here, but I want it to be known upfront that being born and raised on the 'engineering' side of things as I have been, even at the age (and wisdom ???) that I have come to I still have my doubts as to the ultimate feasibility/achievability of the expressed ideal : Chris Date fundamentally feels like optimizers should be made good enough so ***THEY DON'T NEED*** the "hints" coming from "model and (***especially not from***) syntax" to get their job done. Maybe Chris Date set out a Hilbert problem that will take more time to solve than the original Hilbert's original twenty ones.

Author of SIRA_PRISE

#56 · December 8, 2021, 3:05 am

Quote from dandl on December 8, 2021, 3:05 am

The question that interests me from the OP is whether there is a different model of computing in which the EE supports symbols rather than values, and the whole concept of a type 'system' becomes moot. As per the brain model I mentioned earlier, each symbol is imbued with a raft of connections which determine how it interacts with operators (which of course are also just symbols).

There is indeed, and it's called symbolic computation. See https://en.wikipedia.org/wiki/Computer_algebra

No, not even close. "Computer algebra is widely used to experiment in mathematics and to design the formulas that are used in numerical programs. " The symbols there are just another data type.

Though with modern desktop computing hardware, it's inevitably emulated on top of systems built out of arrays of bytes/bits/transistor-states/voltages. Obviously, computational hardware abstracts away transistor-states/voltages, so we typically only consider arrays of bits/bytes at the lowest level, though pipeline optimisation may well consider processor architecture and the like.

No, this is back to front. A computer language targets two closely specified abstract machines (Translation and Execution). We implement those machines in silicon and software (using yet other languages). At the language level we are discussing, the hardware has not been 'abstracted away', it simply does not exist and never did.

Indeed that is the trick: to get a syntax that is actually just manipulating relations, but that looks like (as far as you can get) "traditional scalar expressions". I have some ideas, I might even have got somewhere, but I'd certainly be interested in any pointers of anything like this trick out in the wild

My answer is that from a practical point of view -- i.e., if you want to create a usefully performant, practical tool -- it's probably easier to implement traditional scalar expressions as scalar expressions and make it look like it's manipulating relations (where appropriate), rather than implement scalar expressions by manipulating relations and make them look like scalar expressions.

If we assume a language based on the (E)RA (9 operators plus assignment) it will take a form similar to Codd:

Relation variables (defined with a heading)

Literal relations (with a heading and composed of literal scalar values arranged as tuples)

A library of relcons (each with a known heading)

Operators on those relations.

Scalar expressions are not a requirement, and it will perform just fine.

It will perform just fine for those specific operations that it performs -- treating scalar values as opaque -- though if used to implement scalar expressions (which Paul confirmed in a post above, above) it probably won't perform very well compared to implementing scalar expressions in the usual fashion.

He didn't say "implements", he said "looks like (as far as you can get)" and no, there are no 'opaque' scalar values. This is a language of relations, it's relations all the way down, the expressions are written in a familiar style, and it will perform just fine. Indeed, given the right kind of lower level support taking advantage of parallelism it could easily out-perform 'the usual fashion'.

The question that interests me from the OP is whether there is a different model of computing in which the EE supports symbols rather than values, and the whole concept of a type 'system' becomes moot. As per the brain model I mentioned earlier, each symbol is imbued with a raft of connections which determine how it interacts with operators (which of course are also just symbols).

There is indeed, and it's called symbolic computation. See https://en.wikipedia.org/wiki/Computer_algebra

No, not even close. "Computer algebra is widely used to experiment in mathematics and to design the formulas that are used in numerical programs. " The symbols there are just another data type.

Though with modern desktop computing hardware, it's inevitably emulated on top of systems built out of arrays of bytes/bits/transistor-states/voltages. Obviously, computational hardware abstracts away transistor-states/voltages, so we typically only consider arrays of bits/bytes at the lowest level, though pipeline optimisation may well consider processor architecture and the like.

No, this is back to front. A computer language targets two closely specified abstract machines (Translation and Execution). We implement those machines in silicon and software (using yet other languages). At the language level we are discussing, the hardware has not been 'abstracted away', it simply does not exist and never did.

Indeed that is the trick: to get a syntax that is actually just manipulating relations, but that looks like (as far as you can get) "traditional scalar expressions". I have some ideas, I might even have got somewhere, but I'd certainly be interested in any pointers of anything like this trick out in the wild

My answer is that from a practical point of view -- i.e., if you want to create a usefully performant, practical tool -- it's probably easier to implement traditional scalar expressions as scalar expressions and make it look like it's manipulating relations (where appropriate), rather than implement scalar expressions by manipulating relations and make them look like scalar expressions.

If we assume a language based on the (E)RA (9 operators plus assignment) it will take a form similar to Codd:

Relation variables (defined with a heading)

Literal relations (with a heading and composed of literal scalar values arranged as tuples)

A library of relcons (each with a known heading)

Operators on those relations.

Scalar expressions are not a requirement, and it will perform just fine.

It will perform just fine for those specific operations that it performs -- treating scalar values as opaque -- though if used to implement scalar expressions (which Paul confirmed in a post above, above) it probably won't perform very well compared to implementing scalar expressions in the usual fashion.

He didn't say "implements", he said "looks like (as far as you can get)" and no, there are no 'opaque' scalar values. This is a language of relations, it's relations all the way down, the expressions are written in a familiar style, and it will perform just fine. Indeed, given the right kind of lower level support taking advantage of parallelism it could easily out-perform 'the usual fashion'.

Andl - A New Database Language - andl.org

#57 · December 8, 2021, 4:29 am

Quote from Paul Vernon on December 7, 2021, 4:24 pm

Quote from Dave Voorhis on December 7, 2021, 1:54 pm

It will perform just fine for those specific operations that it performs -- treating scalar values as opaque -- though if used to implement scalar expressions (which Paul confirmed in a post above, above) it probably won't perform very well compared to implementing scalar expressions in the usual fashion.

Yes. So a JOIN to a PLUS relation (relcon if you prefer that term) is ideally going to ultimately (in the best case, if not all when we consider big integers, rationals etc) resolve down in an implementation to a single machine instruction (a RISC-V add say) for each pair of input "substitute" "scalar" values.

The term 'relcon' is D&D Algebra-A. I would prefer relfun, because they are actually relations that act as functions.

And please get the direction right. Nothing 'resolves down'. The language targets an abstract/virtual machine for its execution, and that machine is built up out of available hardware and software components. I would expect the PLUS relcon implementation to add together two arrays of numbers, which on suitable hardware (perhaps a GPU) is a single parallel operation.

Of course this is strictly just an implementation mater, but if the model and syntax can help the implementation out here, then yes that is a good thing (and would make any prototype simpler to code in an existing language)

The language is done, all bar a choice of syntax: http://www.andl.org/2021/07/formal-definitions-for-an-extended-relational-algebra/.

Please note: it's not Turing Complete (all queries are safe), and it can do most everything in SQL but not ordered queries.

Andl - A New Database Language - andl.org

#58 · December 8, 2021, 4:50 am

Quote from dandl on December 8, 2021, 4:50 am

Quote from Dave Voorhis on December 7, 2021, 7:02 pm

Quote from Paul Vernon on December 7, 2021, 3:48 pm

Quote from Dave Voorhis on December 7, 2021, 2:12 pm

Then the next contentious consideration will be whether it is a statically-checked (i.e., errors checked before run-time) #3 or a dynamically-checked (i.e., no errors checked before run-time) #3.

Errors? Oh, I would not have them. I refer the honourable gentlemen to the answer I gave earlier https://forum.thethirdmanifesto.com/forum/topic/adding-exceptions-to-tutorial-d/?part=2#postid-946082

What I would have, at development, code commit/review/test time and at submit time, would be actionable feedback that might say "Your expression/statement fragment will always [return empty/never return in your lifetime

This is precisely the goal I set out in "higher, shorter, safer". The design of languages such that "if it compiles it runs". No exceptions, no runtime errors, no infinite loops.

Curious how you'll determine "never return in your lifetime", or generally determine "return empty" for arbitrary code, given it's the Halting Problem.

No so. This is a language design problem such that those programs cannot be written. Every loop or recursive call to be provably terminating. Very different problem.

/be semantically really very dubious] given [the database's current constraints/the database's current data/logic/declared semantics]. Is that what you [want/wanted], because that's [what will/is going to/has likely already] happen[ed]?"

or things like "You asserted that the cardinality of the result of this join should be the same cardinality as the left input into the join. I can't prove that will be the case given the current database constraints, but I can see that given the current data (as of the time of this message) your assertion will be true. Up to you if you still want to execute your statement as-is or modify it so that e.g. it does not [update|return] anything if your assertion is not true at run-time."

Sounds like a pre-execution-time (usually known as compile-time, whether there's actually compilation or not) warning.

Every language has a translate phase preceding the execute phase, but yes, for 'safer' the presumption is that the translate phase runs to completion and decides whether the program is 'safe' to execute.

How do you intend to handle something like division by zero at run-time?

In a provably safe program, division by zero cannot happen. Mechanisms to achieve that are trivial.

Quote from Dave Voorhis on December 7, 2021, 7:02 pm

Quote from Paul Vernon on December 7, 2021, 3:48 pm

Quote from Dave Voorhis on December 7, 2021, 2:12 pm

Then the next contentious consideration will be whether it is a statically-checked (i.e., errors checked before run-time) #3 or a dynamically-checked (i.e., no errors checked before run-time) #3.

Errors? Oh, I would not have them. I refer the honourable gentlemen to the answer I gave earlier https://forum.thethirdmanifesto.com/forum/topic/adding-exceptions-to-tutorial-d/?part=2#postid-946082

What I would have, at development, code commit/review/test time and at submit time, would be actionable feedback that might say "Your expression/statement fragment will always [return empty/never return in your lifetime

This is precisely the goal I set out in "higher, shorter, safer". The design of languages such that "if it compiles it runs". No exceptions, no runtime errors, no infinite loops.

Curious how you'll determine "never return in your lifetime", or generally determine "return empty" for arbitrary code, given it's the Halting Problem.

No so. This is a language design problem such that those programs cannot be written. Every loop or recursive call to be provably terminating. Very different problem.

/be semantically really very dubious] given [the database's current constraints/the database's current data/logic/declared semantics]. Is that what you [want/wanted], because that's [what will/is going to/has likely already] happen[ed]?"

or things like "You asserted that the cardinality of the result of this join should be the same cardinality as the left input into the join. I can't prove that will be the case given the current database constraints, but I can see that given the current data (as of the time of this message) your assertion will be true. Up to you if you still want to execute your statement as-is or modify it so that e.g. it does not [update|return] anything if your assertion is not true at run-time."

Sounds like a pre-execution-time (usually known as compile-time, whether there's actually compilation or not) warning.

Every language has a translate phase preceding the execute phase, but yes, for 'safer' the presumption is that the translate phase runs to completion and decides whether the program is 'safe' to execute.

How do you intend to handle something like division by zero at run-time?

In a provably safe program, division by zero cannot happen. Mechanisms to achieve that are trivial.

Andl - A New Database Language - andl.org

#59 · December 8, 2021, 9:42 am

Quote from dandl on December 8, 2021, 4:50 am

This is precisely the goal I set out in "higher, shorter, safer". The design of languages such that "if it compiles it runs". No exceptions, no runtime errors, no infinite loops.

With the kind of (non) type system I have in mind. I would say that "if it syntactically parses it runs". It is even curious to consider going further any saying that "everything parses", but that's a different thought.

Curious how you'll determine "never return in your lifetime", or generally determine "return empty" for arbitrary code, given it's the Halting Problem.

No so. This is a language design problem such that those programs cannot be written. Every loop or recursive call to be provably terminating. Very different problem.

The problem with lack of computational completeness is that you then have to bifurcate your world into the "safe but limited" world of data (access), and the "unsafe but unlimited" world of programming.

To keep things as simple as possible, you would want one world (unless that is provably too simple).

As you said in your next post (my bold)

Please note: it's not Turing Complete (all queries are safe), and it can do most everything in SQL but not ordered queries.

I'm afraid that is not good enough

(P.S. by ordered queries, do you mean what D&D call "Quota Queries"?)

#60 · December 8, 2021, 9:42 am

Quote from Dave Voorhis on December 8, 2021, 9:42 am

Quote from dandl on December 8, 2021, 3:05 am

The question that interests me from the OP is whether there is a different model of computing in which the EE supports symbols rather than values, and the whole concept of a type 'system' becomes moot. As per the brain model I mentioned earlier, each symbol is imbued with a raft of connections which determine how it interacts with operators (which of course are also just symbols).

There is indeed, and it's called symbolic computation. See https://en.wikipedia.org/wiki/Computer_algebra

No, not even close. "Computer algebra is widely used to experiment in mathematics and to design the formulas that are used in numerical programs. " The symbols there are just another data type.

Nope, read further. Also known as "computer algebra systems". See https://en.wikipedia.org/wiki/Computer_algebra_system. There's also https://en.wikipedia.org/wiki/Symbolic_artificial_intelligence. Etc.

Though with modern desktop computing hardware, it's inevitably emulated on top of systems built out of arrays of bytes/bits/transistor-states/voltages. Obviously, computational hardware abstracts away transistor-states/voltages, so we typically only consider arrays of bits/bytes at the lowest level, though pipeline optimisation may well consider processor architecture and the like.

No, this is back to front. A computer language targets two closely specified abstract machines (Translation and Execution). We implement those machines in silicon and software (using yet other languages). At the language level we are discussing, the hardware has not been 'abstracted away', it simply does not exist and never did.

Some good molehill mountaineering, here. The original context was a quip (complete with smiley) about physical representation in real machines, appropriate to the context. Whatever theoretical framework might be inferred from a development process, on modern desktop computing hardware, software is built on top of systems built out of arrays of bytes/bits/transistor-states/voltages. Obviously, computational hardware abstracts away transistor-states/voltages, so we typically only consider arrays of bits/bytes at the lowest level, though pipeline optimisation may well consider processor architecture and the like. (Yes, I've copied the above, as it's appropriate here.) This is relevant if you're building a practical tool to run on real (or real virtual) machines.

Indeed that is the trick: to get a syntax that is actually just manipulating relations, but that looks like (as far as you can get) "traditional scalar expressions". I have some ideas, I might even have got somewhere, but I'd certainly be interested in any pointers of anything like this trick out in the wild

My answer is that from a practical point of view -- i.e., if you want to create a usefully performant, practical tool -- it's probably easier to implement traditional scalar expressions as scalar expressions and make it look like it's manipulating relations (where appropriate), rather than implement scalar expressions by manipulating relations and make them look like scalar expressions.

If we assume a language based on the (E)RA (9 operators plus assignment) it will take a form similar to Codd:

Relation variables (defined with a heading)

Literal relations (with a heading and composed of literal scalar values arranged as tuples)

A library of relcons (each with a known heading)

Operators on those relations.

Scalar expressions are not a requirement, and it will perform just fine.

It will perform just fine for those specific operations that it performs -- treating scalar values as opaque -- though if used to implement scalar expressions (which Paul confirmed in a post above, above) it probably won't perform very well compared to implementing scalar expressions in the usual fashion.

He didn't say "implements", he said "looks like (as far as you can get)" and no, there are no 'opaque' scalar values. This is a language of relations, it's relations all the way down, the expressions are written in a familiar style, and it will perform just fine. Indeed, given the right kind of lower level support taking advantage of parallelism it could easily out-perform 'the usual fashion'.

I think Paul confirmed my interpretation of his post in a subsequent post above.

'Opaque' means 'without type information', again appropriate to the context of the original discussion.

I was speaking of a single scalar expression, e.g., v := p + q, not a vectorisation of some relational operator like EXTEND r {v := p + q} or whatever, though even there I would presume any notional JOIN is optimised away.

Quote from dandl on December 8, 2021, 4:29 am

Quote from Paul Vernon on December 7, 2021, 4:24 pm

Quote from Dave Voorhis on December 7, 2021, 1:54 pm

It will perform just fine for those specific operations that it performs -- treating scalar values as opaque -- though if used to implement scalar expressions (which Paul confirmed in a post above, above) it probably won't perform very well compared to implementing scalar expressions in the usual fashion.

Yes. So a JOIN to a PLUS relation (relcon if you prefer that term) is ideally going to ultimately (in the best case, if not all when we consider big integers, rationals etc) resolve down in an implementation to a single machine instruction (a RISC-V add say) for each pair of input "substitute" "scalar" values.

The term 'relcon' is D&D Algebra-A. I would prefer relfun, because they are actually relations that act as functions.

And please get the direction right. Nothing 'resolves down'. The language targets an abstract/virtual machine for its execution, and that machine is built up out of available hardware and software components. I would expect the PLUS relcon implementation to add together two arrays of numbers, which on suitable hardware (perhaps a GPU) is a single parallel operation.

Of course this is strictly just an implementation mater, but if the model and syntax can help the implementation out here, then yes that is a good thing (and would make any prototype simpler to code in an existing language)

The language is done, all bar a choice of syntax: http://www.andl.org/2021/07/formal-definitions-for-an-extended-relational-algebra/.

Please note: it's not Turing Complete (all queries are safe), and it can do most everything in SQL but not ordered queries.

You should implement it. Of course, the discussion here has been around Paul's proposed/work-in-progress implementation, not yours -- or your particular relational algebra. I'll be interested to see what operators, strategies, semantics and syntax Paul chooses for his implementation, and the discussions here are -- in various ways -- around that.

Quote from dandl on December 8, 2021, 4:50 am

Quote from Dave Voorhis on December 7, 2021, 7:02 pm

Quote from Paul Vernon on December 7, 2021, 3:48 pm

Quote from Dave Voorhis on December 7, 2021, 2:12 pm

Then the next contentious consideration will be whether it is a statically-checked (i.e., errors checked before run-time) #3 or a dynamically-checked (i.e., no errors checked before run-time) #3.

Errors? Oh, I would not have them. I refer the honourable gentlemen to the answer I gave earlier https://forum.thethirdmanifesto.com/forum/topic/adding-exceptions-to-tutorial-d/?part=2#postid-946082

What I would have, at development, code commit/review/test time and at submit time, would be actionable feedback that might say "Your expression/statement fragment will always [return empty/never return in your lifetime

This is precisely the goal I set out in "higher, shorter, safer". The design of languages such that "if it compiles it runs". No exceptions, no runtime errors, no infinite loops.

Curious how you'll determine "never return in your lifetime", or generally determine "return empty" for arbitrary code, given it's the Halting Problem.

No so. This is a language design problem such that those programs cannot be written. Every loop or recursive call to be provably terminating. Very different problem.

Sure, but my inference from Paul's previous content is that it is intended to be Turing Complete, so that it is potentially an issue. But maybe I've mis-inferred the intent to be Turing Complete; I assume Paul can confirm/deny.

/be semantically really very dubious] given [the database's current constraints/the database's current data/logic/declared semantics]. Is that what you [want/wanted], because that's [what will/is going to/has likely already] happen[ed]?"

or things like "You asserted that the cardinality of the result of this join should be the same cardinality as the left input into the join. I can't prove that will be the case given the current database constraints, but I can see that given the current data (as of the time of this message) your assertion will be true. Up to you if you still want to execute your statement as-is or modify it so that e.g. it does not [update|return] anything if your assertion is not true at run-time."

Sounds like a pre-execution-time (usually known as compile-time, whether there's actually compilation or not) warning.

Every language has a translate phase preceding the execute phase, but yes, for 'safer' the presumption is that the translate phase runs to completion and decides whether the program is 'safe' to execute.

How do you intend to handle something like division by zero at run-time?

In a provably safe program, division by zero cannot happen. Mechanisms to achieve that are trivial.

Sure, and the question is what mechanism(s) Paul intends to use.

Quote from dandl on December 8, 2021, 3:05 am

The question that interests me from the OP is whether there is a different model of computing in which the EE supports symbols rather than values, and the whole concept of a type 'system' becomes moot. As per the brain model I mentioned earlier, each symbol is imbued with a raft of connections which determine how it interacts with operators (which of course are also just symbols).

There is indeed, and it's called symbolic computation. See https://en.wikipedia.org/wiki/Computer_algebra

No, not even close. "Computer algebra is widely used to experiment in mathematics and to design the formulas that are used in numerical programs. " The symbols there are just another data type.

Nope, read further. Also known as "computer algebra systems". See https://en.wikipedia.org/wiki/Computer_algebra_system. There's also https://en.wikipedia.org/wiki/Symbolic_artificial_intelligence. Etc.

Though with modern desktop computing hardware, it's inevitably emulated on top of systems built out of arrays of bytes/bits/transistor-states/voltages. Obviously, computational hardware abstracts away transistor-states/voltages, so we typically only consider arrays of bits/bytes at the lowest level, though pipeline optimisation may well consider processor architecture and the like.

No, this is back to front. A computer language targets two closely specified abstract machines (Translation and Execution). We implement those machines in silicon and software (using yet other languages). At the language level we are discussing, the hardware has not been 'abstracted away', it simply does not exist and never did.

Some good molehill mountaineering, here. The original context was a quip (complete with smiley) about physical representation in real machines, appropriate to the context. Whatever theoretical framework might be inferred from a development process, on modern desktop computing hardware, software is built on top of systems built out of arrays of bytes/bits/transistor-states/voltages. Obviously, computational hardware abstracts away transistor-states/voltages, so we typically only consider arrays of bits/bytes at the lowest level, though pipeline optimisation may well consider processor architecture and the like. (Yes, I've copied the above, as it's appropriate here.) This is relevant if you're building a practical tool to run on real (or real virtual) machines.

Indeed that is the trick: to get a syntax that is actually just manipulating relations, but that looks like (as far as you can get) "traditional scalar expressions". I have some ideas, I might even have got somewhere, but I'd certainly be interested in any pointers of anything like this trick out in the wild

My answer is that from a practical point of view -- i.e., if you want to create a usefully performant, practical tool -- it's probably easier to implement traditional scalar expressions as scalar expressions and make it look like it's manipulating relations (where appropriate), rather than implement scalar expressions by manipulating relations and make them look like scalar expressions.

If we assume a language based on the (E)RA (9 operators plus assignment) it will take a form similar to Codd:

Relation variables (defined with a heading)

Literal relations (with a heading and composed of literal scalar values arranged as tuples)

A library of relcons (each with a known heading)

Operators on those relations.

Scalar expressions are not a requirement, and it will perform just fine.

It will perform just fine for those specific operations that it performs -- treating scalar values as opaque -- though if used to implement scalar expressions (which Paul confirmed in a post above, above) it probably won't perform very well compared to implementing scalar expressions in the usual fashion.

He didn't say "implements", he said "looks like (as far as you can get)" and no, there are no 'opaque' scalar values. This is a language of relations, it's relations all the way down, the expressions are written in a familiar style, and it will perform just fine. Indeed, given the right kind of lower level support taking advantage of parallelism it could easily out-perform 'the usual fashion'.

I think Paul confirmed my interpretation of his post in a subsequent post above.

'Opaque' means 'without type information', again appropriate to the context of the original discussion.

I was speaking of a single scalar expression, e.g., v := p + q, not a vectorisation of some relational operator like EXTEND r {v := p + q} or whatever, though even there I would presume any notional JOIN is optimised away.

Quote from dandl on December 8, 2021, 4:29 am

Quote from Paul Vernon on December 7, 2021, 4:24 pm

Quote from Dave Voorhis on December 7, 2021, 1:54 pm

It will perform just fine for those specific operations that it performs -- treating scalar values as opaque -- though if used to implement scalar expressions (which Paul confirmed in a post above, above) it probably won't perform very well compared to implementing scalar expressions in the usual fashion.

Yes. So a JOIN to a PLUS relation (relcon if you prefer that term) is ideally going to ultimately (in the best case, if not all when we consider big integers, rationals etc) resolve down in an implementation to a single machine instruction (a RISC-V add say) for each pair of input "substitute" "scalar" values.

The term 'relcon' is D&D Algebra-A. I would prefer relfun, because they are actually relations that act as functions.

And please get the direction right. Nothing 'resolves down'. The language targets an abstract/virtual machine for its execution, and that machine is built up out of available hardware and software components. I would expect the PLUS relcon implementation to add together two arrays of numbers, which on suitable hardware (perhaps a GPU) is a single parallel operation.

Of course this is strictly just an implementation mater, but if the model and syntax can help the implementation out here, then yes that is a good thing (and would make any prototype simpler to code in an existing language)

The language is done, all bar a choice of syntax: http://www.andl.org/2021/07/formal-definitions-for-an-extended-relational-algebra/.

Please note: it's not Turing Complete (all queries are safe), and it can do most everything in SQL but not ordered queries.

You should implement it. Of course, the discussion here has been around Paul's proposed/work-in-progress implementation, not yours -- or your particular relational algebra. I'll be interested to see what operators, strategies, semantics and syntax Paul chooses for his implementation, and the discussions here are -- in various ways -- around that.

Quote from dandl on December 8, 2021, 4:50 am

Quote from Dave Voorhis on December 7, 2021, 7:02 pm

Quote from Paul Vernon on December 7, 2021, 3:48 pm

Quote from Dave Voorhis on December 7, 2021, 2:12 pm

Then the next contentious consideration will be whether it is a statically-checked (i.e., errors checked before run-time) #3 or a dynamically-checked (i.e., no errors checked before run-time) #3.

Errors? Oh, I would not have them. I refer the honourable gentlemen to the answer I gave earlier https://forum.thethirdmanifesto.com/forum/topic/adding-exceptions-to-tutorial-d/?part=2#postid-946082

What I would have, at development, code commit/review/test time and at submit time, would be actionable feedback that might say "Your expression/statement fragment will always [return empty/never return in your lifetime

This is precisely the goal I set out in "higher, shorter, safer". The design of languages such that "if it compiles it runs". No exceptions, no runtime errors, no infinite loops.

Curious how you'll determine "never return in your lifetime", or generally determine "return empty" for arbitrary code, given it's the Halting Problem.

No so. This is a language design problem such that those programs cannot be written. Every loop or recursive call to be provably terminating. Very different problem.

Sure, but my inference from Paul's previous content is that it is intended to be Turing Complete, so that it is potentially an issue. But maybe I've mis-inferred the intent to be Turing Complete; I assume Paul can confirm/deny.

/be semantically really very dubious] given [the database's current constraints/the database's current data/logic/declared semantics]. Is that what you [want/wanted], because that's [what will/is going to/has likely already] happen[ed]?"

or things like "You asserted that the cardinality of the result of this join should be the same cardinality as the left input into the join. I can't prove that will be the case given the current database constraints, but I can see that given the current data (as of the time of this message) your assertion will be true. Up to you if you still want to execute your statement as-is or modify it so that e.g. it does not [update|return] anything if your assertion is not true at run-time."

Sounds like a pre-execution-time (usually known as compile-time, whether there's actually compilation or not) warning.

Every language has a translate phase preceding the execute phase, but yes, for 'safer' the presumption is that the translate phase runs to completion and decides whether the program is 'safe' to execute.

How do you intend to handle something like division by zero at run-time?

In a provably safe program, division by zero cannot happen. Mechanisms to achieve that are trivial.

Sure, and the question is what mechanism(s) Paul intends to use.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

The Forum for Discussion about The Third Manifesto and Related Matters