The Forum for Discussion about The Third Manifesto and Related Matters


Life after J

Quote from dandl on April 10, 2021, 6:45 am

 

That's the aspiration, but again, you can't think higher while you're still concerned with safety, and as long as you think container, it's still about the rows.

No more or less than "as long as you think relation, it's still about the tuples."

Well, no. The RA is fully defined over relations, with nary a tuple in sight. One of the serious flaws in TD is that it embeds tuple-notation into its version of RA, which breaks the model. Algebra A showed us a way to express selection and new values as relational operators, but again broke the model by expressing relcons as tuples.

Arrant nonsense.

Relations are sets. Some of the relational operators are set operators. But relations are not sets of just anything. I fail to see how you could adequately express the model without characterising the elements of those sets, and for example explaining (in whatever concrete syntax you clothe it):

Again, you're right into implementation detail. A relation is (a) a safer data structure that conforms to certain rules (see implementation details) and (b) an argument to a higher relational operator. The point about safer (details guaranteed by the implementation) is to get to higher (don't think about the implementation).

No. The theoretical definition of a (TTM, at least) relation is that it has a heading and a body, and the body is zero or more tuples.

It may be implemented using a variety of data structures.  "Higher" (in general) is not about dispensing with the theoretical definition, but about not having to consider implementation details, like whether or not a relation has indexes, whether or not they're implemented using B-Trees, and so forth.

TUP{ X 1, Y 'foo' }     // }
TUP( Y 'foo', X 1 )     // } are duplicates, so not allowed in the same relation value

TUP{ X 1 }              // }
TUP{ X 1, Y 'foo' }     // } are not duplicates, nevertheless are not allowed in the same relation value

TUP{ X 1, X 2 }         // not allowed in any relation value, even though elements of the TUP are not duplicates

The above is implementation detail, ignored in the context of the RA.

The above are implications of the theoretical definition of a (typed TTM) relation.
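These rules can be sketched concretely. The following is a minimal Python illustration (not TTM syntax; the function names are invented for this example), modelling a tuple as an immutable attribute-name-to-value mapping:

```python
# Sketch of the TUP rules above in plain Python. A tuple is a set of
# (name, value) pairs, so attribute order is irrelevant; Python's keyword
# arguments already reject a repeated name, so TUP{ X 1, X 2 } cannot
# even be written.

def tup(**attrs):
    return frozenset(attrs.items())

def heading_of(t):
    # The heading is just the set of attribute names.
    return frozenset(name for name, _ in t)

def insert(body, heading, t):
    # A tuple may enter a relation body only if its heading matches;
    # set semantics make inserting a duplicate a no-op.
    if heading_of(t) != heading:
        raise ValueError("heading mismatch")
    return body | {t}

H = frozenset({'X', 'Y'})
body = insert(frozenset(), H, tup(X=1, Y='foo'))
body = insert(body, H, tup(Y='foo', X=1))   # duplicate: order doesn't matter
assert len(body) == 1

try:
    insert(body, H, tup(X=1))               # differing heading: rejected
    assert False
except ValueError:
    pass
```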

The extended RA I proposed has no tuples, anywhere. It has headings (for projection and rename) and it has functions (for selection, new values and aggregation) but absolutely no tuples. Yes, you need some kind of syntax for literals but that's a language choice and that doesn't have to be tuples either.

In what sense is whatever you proposed any sort of RA? How do we for example attach characteristic predicates to relations so that we (or rather users of a database) can tell whether the database content matches the 'mini-world' of the enterprise?

Remember that the operations of the RA are a means to enquire about the salient facts, and their implications, that the database content represents.

A relation exposes a heading, and the business predicate relates the business facts to the content of the relation by means of the heading. The results of an enquiry will always be another relation, with a heading. The implementation will provide a means to convert between relations and other representations, but that's not part of the RA.

The semantics of the relational operators are in terms of headings and (body) tuples.

I'm absolutely serious about this. The only way you can really think about any of this stuff at any level is by not thinking about the levels further down. To really think about relations you have to not think about tuples or values or strings or characters or encodings or bytes (or memory cells or chips or silicon or electrons).

Indeed, if you're manipulating strings you should not have to consider what encoding is used or how many bytes or bits per character or whether the strings are implemented as arrays or ropes or whatever concrete data structure might be behind the string interface(s), but the semantics of string manipulation necessarily describes what happens to characters in strings.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from dandl on April 10, 2021, 1:54 am
Quote from Dave Voorhis on April 7, 2021, 12:03 pm
Quote from dandl on April 7, 2021, 10:23 am

Yes, Rust was the best I found on my quest for M and then S. I'm increasingly certain that until we solve the safety problem for all kinds of programming, we can't move to the next level.

Write a language and scratch that itch. That's what all of us implementers have done. We're a very diverse group, so trying to sell us on a vapourware bullet-point list is only going to spur debate. That's fine such as it is (this is a discussion forum, after all) but if you're looking for some broad consensus or buy-in, it's almost certainly not going to happen. Criticism, though -- you'll get a lot of that.

It's too hard, the hurdle is too high. Andl showed me that, if I didn't know already.

Then I'm not clear where you were/are going with this thread.  Are you not planning to build this post-Java/C#/C++ language?

Or is this a general "we can but dream" discussion about Rust and what it might look like?

I thought the point of the discussion was 'life after D'. Given that there are some really good ideas in TTM but the language as specified has failed to gain traction, what kind of language should we strive for.

I started on the theme of M, because C# (and perhaps other GP languages) is nearly good enough to do TTM-alike, but needs compile-time extensions to implement a genuine extended RA. My theme was: shorter, safer, higher. As I worked through the arguments I came to realise that safer comes first: you can't do higher if you're worried about null pointers and exceptions. Higher automatically leads to shorter. You can't use text macros to do shorter because you lose safer. It has to be safer->higher->shorter. M is part of it, but not the driver.

On my current understanding Rust is the closest, but on current indications their focus is more an Ada/C++ replacement than a Java/C#/Python replacement and perhaps not well suited as a D. Finding out is a project.

So now I need a project to try it out, and perhaps implementing the Extended RA is one worth trying. One thing Andl taught me was: I really miss programming with relations!

Is it the relations and relational algebra in particular -- no duplicate tuples, the particular relational operators and such -- that are appealing?

Or is it programming by writing expressions to transform input to output via immutable arguments and return values, and a set of composable operators?

Whilst I appreciate the relational model (of course), for me it's the latter. The relational model is just one example of the benefit of expressing (certain) programs using stateless transformations, but there are others: C# LINQ; Java Streams; various vector/matrix libraries; even bash scripting using find, sed, awk, cut, grep as operators with pipe to pass output from one to the next.
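All of those share one shape -- a value flowing through pure operators -- which can be sketched in a few lines of Python (a hand-rolled pipeline, illustrative only, not any particular library's API):

```python
from functools import reduce

def pipeline(value, *operators):
    # Each operator is a pure function from value to value; nothing is
    # mutated, so the whole computation is a single expression, in the
    # spirit of LINQ / Streams / shell pipes.
    return reduce(lambda acc, op: op(acc), operators, value)

people = (
    {'name': 'ann', 'dept': 'dev', 'salary': 70},
    {'name': 'bob', 'dept': 'ops', 'salary': 50},
    {'name': 'cy',  'dept': 'dev', 'salary': 60},
)

total_dev_salary = pipeline(
    people,
    lambda rs: tuple(r for r in rs if r['dept'] == 'dev'),  # restrict
    lambda rs: tuple(r['salary'] for r in rs),              # project
    sum,                                                    # aggregate
)
assert total_dev_salary == 130
```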

It's the ability to use a higher mental model. It's the same movement as from spaghetti code to structured programming, from explicit for loops to foreach, from loops to streams. It's being able to think of a relation as a single entity, not a list of rows or a stream of tuples. It's being able to think: if I joined this to that and projected it onto the other then it would have this shape and it would fit into that need. I had that feeling with arrays in APL, and it's a rare feeling. It's not a pipe so much as a production line of whole assemblies built out of components.

Exactly. I get that feeling from using Rel. I also get that feeling from C# LINQ; Java Streams; various vector/matrix libraries; bash scripting using find, sed, awk, cut, grep; and strings and string operators in various languages.

Only I don't think of a list of rows or a stream of tuples. It's a container, a collection, a matrix, an array, a text stream, a string, or a relation.

In short, it's values and operators that take values as arguments and return a value.

That's the aspiration, but again, you can't think higher while you're still concerned with safety, and as long as you think container, it's still about the rows.

No more or less than "as long as you think relation, it's still about the tuples."

Well, no. The RA is fully defined over relations, with nary a tuple in sight. One of the serious flaws in TD is that it embeds tuple-notation into its version of RA, which breaks the model. Algebra A showed us a way to express selection and new values as relational operators, but again broke the model by expressing relcons as tuples. The extended RA I proposed has no tuples, anywhere. It has headings (for projection and rename) and it has functions (for selection, new values and aggregation) but absolutely no tuples. Yes, you need some kind of syntax for literals but that's a language choice and that doesn't have to be tuples either.

Dealing with possible null values or exceptions tends to be what complicates most value/operator systems like Java Streams and C# LINQ, and for that matter, strings and string operators, matrix libraries in the usual popular languages, etc. There are usually mechanisms for making these somewhat manageable, though had there never been null it would perhaps have generally been easier.

I agree: that's the safer step I've been talking about, but higher comes after that.

I've used all those tools, and almost all of them require code written at the row level: a regex, or a tuple expression or similar. You aspire to think higher, but the code you write is row level. Matrix libraries and APL, set operations, the pure RA are the exception in that they work on the whole thing ('closed over relation'), but most of those things work on rows.

The relational model is also "work on rows" (or "work on tuples"), in a simplistic sense. String operators "work on characters", and so on. For all such systems, you can either view them as values and operators on values, or view them as complex structures and operators on components of complex structures. To effectively use them, we generally understand their semantics as being both. That's the case whether we're considering the relational model, C# LINQ, Java Streams, strings and string operators, linear algebra systems, or numerous other implementations of the essential values/operators idea.

I'm talking mental model, not implementation. Your "work on" is implementation, my "think about" is the abstraction, the mental model. If we want to operate on strings we should do so with string operators (leaving the character nasties to the implementor). It's not safer to think about both, it's safer to think at the higher level and have the implementation guarantee that it works right at the lower level.

Case in point: text processing. The mental model is (should be): a text object and operators on it. We are aware that it consists of strings and delimiters which in turn are characters and bytes, but we don't want to think about that, and it seems we can't avoid it: grep and the Unix shell tools force us to think at the level of strings (lines). We write some code, it doesn't work, and now we find it used the wrong CRLF convention, so we're back into characters. To have operators on text objects we first need safer, so strings and characters can be ignored. Then the mental model is no longer 'read lines, do something to each, write lines' but instead 'read text, apply operators, write text'. That's higher.
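As a sketch of that mental model, here is an invented Text type (the class and its operator names are illustrative, not a real library) whose operators work on the whole text, with line endings normalised once at the boundary:

```python
# Illustrative 'text object': operators apply to the whole text and return
# a new text; CRLF/CR handling happens exactly once, at construction.

class Text:
    def __init__(self, raw):
        # Normalise line endings at the boundary; callers never see them again.
        self._lines = raw.replace('\r\n', '\n').replace('\r', '\n').split('\n')

    def keep(self, predicate):
        # grep-like, but an operator on the whole text
        return Text('\n'.join(l for l in self._lines if predicate(l)))

    def map(self, f):
        # sed-like whole-text transformation
        return Text('\n'.join(f(l) for l in self._lines))

    def render(self, newline='\n'):
        return newline.join(self._lines)

doc = Text('alpha\r\nbeta\nGAMMA\r\n')
out = doc.keep(str.islower).map(str.upper).render()
assert out == 'ALPHA\nBETA'
```

The 'read text, apply operators, write text' shape is the whole program: the CRLF convention of the input never leaks into the transformation steps.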

The Sudoku solver I wrote in Andl showed me what might be possible, but it's very different from Linq and pipelines. I'm currently working with a data model of 7 relations, but you wouldn't know that from the code.

The next step is Prolog, of course. See https://www.swi-prolog.org/pldoc/man?section=clpfd-sudoku

I did one of those when I was learning Turbo Prolog, but it wasn't fun. The mental model is too different and it seemed to be all about where to put the cut operator. I don't think I ever got to the stage where I could have written that, but the Andl one does exactly the same thing in about 50 lines of code.

Cut should be avoided.

See "Blub Paradox."

Quote from Dave Voorhis on April 10, 2021, 3:47 pm
Quote from dandl on April 10, 2021, 6:45 am

 

That's the aspiration, but again, you can't think higher while you're still concerned with safety, and as long as you think container, it's still about the rows.

No more or less than "as long as you think relation, it's still about the tuples."

Well, no. The RA is fully defined over relations, with nary a tuple in sight. One of the serious flaws in TD is that it embeds tuple-notation into its version of RA, which breaks the model. Algebra A showed us a way to express selection and new values as relational operators, but again broke the model by expressing relcons as tuples.

Arrant nonsense.

Relations are sets. Some of the relational operators are set operators. But relations are not sets of just anything. I fail to see how you could adequately express the model without characterising the elements of those sets, and for example explaining (in whatever concrete syntax you clothe it):

Again, you're right into implementation detail. A relation is (a) a safer data structure that conforms to certain rules (see implementation details) and (b) an argument to a higher relational operator. The point about safer (details guaranteed by the implementation) is to get to higher (don't think about the implementation).

No. The theoretical definition of a (TTM, at least) relation is that it has a heading and a body, and the body is zero or more tuples.

It may be implemented using a variety of data structures.  "Higher" (in general) is not about dispensing with the theoretical definition, but about not having to consider implementation details, like whether or not a relation has indexes, whether or not they're implemented using B-Trees, and so forth.

I have no argument with  that. And tuples are a necessity in specifying the internals of the RA. They just don't need to play any role in the implementation or in a language that implements the RA. I think TTM does a serious disservice by defining a tuple type, which turns out to be pretty useless. Indeed, tuple types are the only reason for the oft-mentioned difficulty in reconciling with records as per most languages. If you have no tuples, the problem goes away. Poof!

TUP{ X 1, Y 'foo' }     // }
TUP( Y 'foo', X 1 )     // } are duplicates, so not allowed in the same relation value

TUP{ X 1 }              // }
TUP{ X 1, Y 'foo' }     // } are not duplicates, nevertheless are not allowed in the same relation value

TUP{ X 1, X 2 }         // not allowed in any relation value, even though elements of the TUP are not duplicates

The above is implementation detail, ignored in the context of the RA.

The above are implications of the theoretical definition of a (typed TTM) relation.

Precisely -- they are mandated by TTM/TD and exist for no other reason.

The extended RA I proposed has no tuples, anywhere. It has headings (for projection and rename) and it has functions (for selection, new values and aggregation) but absolutely no tuples. Yes, you need some kind of syntax for literals but that's a language choice and that doesn't have to be tuples either.

In what sense is whatever you proposed any sort of RA? How do we for example attach characteristic predicates to relations so that we (or rather users of a database) can tell whether the database content matches the 'mini-world' of the enterprise?

Remember that the operations of the RA are a means to enquire about the salient facts, and their implications, that the database content represents.

A relation exposes a heading, and the business predicate relates the business facts to the content of the relation by means of the heading. The results of an enquiry will always be another relation, with a heading. The implementation will provide a means to convert between relations and other representations, but that's not part of the RA.

The semantics of the relational operators are in terms of headings and (body) tuples.

Again, this is true at a definitional and implementation level but is never visible in the RA.

I'm absolutely serious about this. The only way you can really think about any of this stuff at any level is by not thinking about the levels further down. To really think about relations you have to not think about tuples or values or strings or characters or encodings or bytes (or memory cells or chips or silicon or electrons).

Indeed, if you're manipulating strings you should not have to consider what encoding is used or how many bytes or bits per character or whether the strings are implemented as arrays or ropes or whatever concrete data structure might be behind the string interface(s), but the semantics of string manipulation necessarily describes what happens to characters in strings.

Implementation detail again. To go safer and higher you have to be able to ignore detail at the lower level. This is non-optional -- it is the only way.

Andl - A New Database Language - andl.org
Quote from AntC on April 10, 2021, 7:13 am
Quote from dandl on April 10, 2021, 6:45 am

 

I'm absolutely serious about this. The only way you can really think about any of this stuff ...

I see no evidence you're thinking. This is a wild amorphous "vapourware bullet-point list". Any debate is going to be like trying to nail jello to a wall, because you'll just wriggle away claiming any critique is not being 'high level' enough.

D&D have thought about "this stuff". Of course they're not the only who have. Of course you're entitled to disagree with their specifics. But then you must come up with an alternative to the same level of detail as their specifics. Until then your criticisms are hot air.

If you don't like relations being specified in terms of tuples, nor RA operations being expressed in terms of effects on tuples (per Appendix A and/or HH&T 1975), provide an alternative specification. Or stop using 'relations' or 'RA' -- because again you're using a private language containing terms that seem familiar/carry a familiar connotation but denote something different/so far making no sense.

Then you completely miss the point. Relations and the RA are specified (as per Codd) in terms of tuples and operations on sets of tuples. TTM included the named perspective and headings. Then they did the best they could at the time: they added a type system with both relations and tuples. This was wrong. There is absolutely no need to expose tuples as a type, and by doing so they trapped their pet language forever in thinking at the tuple level. Nobody needs a tuple.

I have fully specified an Extended RA with 9 operators (the 6 you know plus newvalue, aggregate and while). It also requires two starting values (DEE and DUM). With this algebra you can create any relational value and perform any query without ever going anywhere near a tuple. This is safer and higher. It also supports a stronger form of RM Pro 7: D shall support no tuple level operations on relvars or relations.
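A toy Python sketch of what such a tuple-free surface might look like: the caller supplies only headings and ordinary functions, and rows exist only inside the implementation. All names here are invented for illustration; this is not the actual algebra described above.

```python
# 'Tuple-free' relational surface: callers pass headings and functions;
# the row representation is private to the implementation.

class Rel:
    def __init__(self, heading, _body=frozenset()):
        self.heading = frozenset(heading)
        self._body = _body    # hidden: frozenset of frozenset-of-pairs

    def project(self, heading):
        h = frozenset(heading)
        body = frozenset(frozenset((a, v) for a, v in row if a in h)
                         for row in self._body)
        return Rel(h, body)

    def restrict(self, pred):
        # pred is an ordinary function of named attributes, not a tuple value
        body = frozenset(row for row in self._body if pred(**dict(row)))
        return Rel(self.heading, body)

    def count(self):
        return len(self._body)

DUM = Rel(frozenset())                              # empty heading, empty body
DEE = Rel(frozenset(), frozenset({frozenset()}))    # empty heading, one row

S = Rel({'sno', 'city'}, frozenset({
    frozenset({('sno', 'S1'), ('city', 'London')}),
    frozenset({('sno', 'S2'), ('city', 'Paris')}),
}))
r = S.restrict(lambda sno, city: city == 'Paris').project({'sno'})
assert r.heading == frozenset({'sno'}) and r.count() == 1
```

Whether hiding the body like this satisfies the stronger form of RM Pro 7 is exactly the point under debate; the sketch only shows that the query surface itself need expose no tuple construct.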

As an aside, TD would work better if it threw out tuples and instead implemented selectors for relations in terms of records (aka POSSREPS). Too late now.

Quote from dandl on April 11, 2021, 12:35 am
Quote from Dave Voorhis on April 10, 2021, 3:47 pm
Quote from dandl on April 10, 2021, 6:45 am

 

That's the aspiration, but again, you can't think higher while you're still concerned with safety, and as long as you think container, it's still about the rows.

No more or less than "as long as you think relation, it's still about the tuples."

Well, no. The RA is fully defined over relations, with nary a tuple in sight. One of the serious flaws in TD is that it embeds tuple-notation into its version of RA, which breaks the model. Algebra A showed us a way to express selection and new values as relational operators, but again broke the model by expressing relcons as tuples.

Arrant nonsense.

Relations are sets. Some of the relational operators are set operators. But relations are not sets of just anything. I fail to see how you could adequately express the model without characterising the elements of those sets, and for example explaining (in whatever concrete syntax you clothe it):

Again, you're right into implementation detail. A relation is (a) a safer data structure that conforms to certain rules (see implementation details) and (b) an argument to a higher relational operator. The point about safer (details guaranteed by the implementation) is to get to higher (don't think about the implementation).

No. The theoretical definition of a (TTM, at least) relation is that it has a heading and a body, and the body is zero or more tuples.

It may be implemented using a variety of data structures.  "Higher" (in general) is not about dispensing with the theoretical definition, but about not having to consider implementation details, like whether or not a relation has indexes, whether or not they're implemented using B-Trees, and so forth.

I have no argument with  that. And tuples are a necessity in specifying the internals of the RA. They just don't need to play any role in the implementation or in a language that implements the RA. I think TTM does a serious disservice by defining a tuple type, which turns out to be pretty useless. Indeed, tuple types are the only reason for the oft-mentioned difficulty in reconciling with records as per most languages. If you have no tuples, the problem goes away. Poof!

No, you still have relation headings/types, and you run into exactly the same issues even if you don't explicitly expose a 'tuple' construct.

TUP{ X 1, Y 'foo' }     // }
TUP( Y 'foo', X 1 )     // } are duplicates, so not allowed in the same relation value

TUP{ X 1 }              // }
TUP{ X 1, Y 'foo' }     // } are not duplicates, nevertheless are not allowed in the same relation value

TUP{ X 1, X 2 }         // not allowed in any relation value, even though elements of the TUP are not duplicates

The above is implementation detail, ignored in the context of the RA.

The above are implications of the theoretical definition of a (typed TTM) relation.

Precisely -- they are mandated by TTM/TD and exist for no other reason.

They are mandated by TTM because a relation without tuples is notionally "just" a set. Set theoretical languages are interesting -- see SETL (which partly inspired Python!) and D L Childs's extended set theory, etc -- but they're not a relational model.

The extended RA I proposed has no tuples, anywhere. It has headings (for projection and rename) and it has functions (for selection, new values and aggregation) but absolutely no tuples. Yes, you need some kind of syntax for literals but that's a language choice and that doesn't have to be tuples either.

In what sense is whatever you proposed any sort of RA? How do we for example attach characteristic predicates to relations so that we (or rather users of a database) can tell whether the database content matches the 'mini-world' of the enterprise?

Remember that the operations of the RA are a means to enquire about the salient facts, and their implications, that the database content represents.

A relation exposes a heading, and the business predicate relates the business facts to the content of the relation by means of the heading. The results of an enquiry will always be another relation, with a heading. The implementation will provide a means to convert between relations and other representations, but that's not part of the RA.

The semantics of the relational operators are in terms of headings and (body) tuples.

Again, this is true at a definitional and implementation level but is never visible in the RA.

It's fundamental to the semantics of relational algebras. There's no requirement to expose tuples or tuple types as self-standing constructs in an implementation, but they're unavoidably essential to the semantics of relational algebra operations, and the definition of relations themselves.

I'm absolutely serious about this. The only way you can really think about any of this stuff at any level is by not thinking about the levels further down. To really think about relations you have to not think about tuples or values or strings or characters or encodings or bytes (or memory cells or chips or silicon or electrons).

Indeed, if you're manipulating strings you should not have to consider what encoding is used or how many bytes or bits per character or whether the strings are implemented as arrays or ropes or whatever concrete data structure might be behind the string interface(s), but the semantics of string manipulation necessarily describes what happens to characters in strings.

Implementation detail again. To go safer and higher you have to be able to ignore detail at the lower level. This is non-optional -- it is the only way.

No, you're conflating semantic foundations and conceptual abstractions with elided implementation details.

You can, of course, abstract string operations so that you have some construct S with algebraic operations append, insert, at and so forth, and have it contain elements. That is more abstract (what you call "higher", I guess) than a string of characters -- you can then call it a dynamic array or whatever -- but if you use it on characters then it is semantically a string and its elements are characters.

Taking such abstractions to their utmost extent you wind up (in practical terms) with nothing but values and operators -- which is fine and much can be done with that, obviously -- but it's only useful in terms of the constructs defined with it: strings that contain characters (and string operations in terms of characters), arrays that contain elements (and array operations in terms of elements), and relations that contain tuples (and relational algebra operators in terms of tuples.)

Some of these things may be defined in terms of others, but each such definition is only meaningful in terms of its semantics (by definition), so you can't simply discard tuples from relations and claim it's "higher relations" (or something), when what you've created is actually a semantically different entity.

Quote from Dave Voorhis on April 11, 2021, 9:38 am
Quote from dandl on April 11, 2021, 12:35 am
Quote from Dave Voorhis on April 10, 2021, 3:47 pm
Quote from dandl on April 10, 2021, 6:45 am

 

That's the aspiration, but again, you can't think higher while you're still concerned with safety, and as long as you think container, it's still about the rows.

No more or less than "as long as you think relation, it's still about the tuples."

Well, no. The RA is fully defined over relations, with nary a tuple in sight. One of the serious flaws in TD is that it embeds tuple-notation into its version of RA, which breaks the model. Algebra A showed us a way to express selection and new values as relational operators, but again broke the model by expressing relcons as tuples.

Arrant nonsense.

Relations are sets. Some of the relational operators are set operators. But relations are not sets of just anything. I fail to see how you could adequately express the model without characterising the elements of those sets, and for example explaining (in whatever concrete syntax you clothe it):

Again, you're right into implementation detail. A relation is (a) a safer data structure that conforms to certain rules (see implementation details) and (b) an argument to a higher relational operator. The point about safer (details guaranteed by the implementation) is to get to higher (don't think about the implementation).

No. The theoretical definition of a (TTM, at least) relation is that it has a heading and a body, and the body is zero or more tuples.

It may be implemented using a variety of data structures.  "Higher" (in general) is not about dispensing with the theoretical definition, but about not having to consider implementation details, like whether or not a relation has indexes, whether or not they're implemented using B-Trees, and so forth.

I have no argument with  that. And tuples are a necessity in specifying the internals of the RA. They just don't need to play any role in the implementation or in a language that implements the RA. I think TTM does a serious disservice by defining a tuple type, which turns out to be pretty useless. Indeed, tuple types are the only reason for the oft-mentioned difficulty in reconciling with records as per most languages. If you have no tuples, the problem goes away. Poof!

No, you still have relation headings/types, and you run into exactly the same issues even if you don't explicitly expose a 'tuple' construct.

I don't think so. A relation sees no types, it just exposes a heading. There are no type-related operations in the RA, it's a fiction grafted on by TTM to create a language. You can think about relations and the operations on them perfectly well and remain ignorant of all the details.

It matters when you want to assert a new fact: the step of creating new values necessarily must respect type constraints. And it matters when you want to retrieve the result of a query in a non-relational form. In between, no tuples and no types.

Precisely -- they are mandated by TTM/TD and exist for no other reason.

They are mandated by TTM because a relation without tuples is notionally "just" a set. Set theoretical languages are interesting -- see SETL (which partly inspired Python!) and D L Childs's extended set theory, etc -- but they're not a relational model.

That makes no sense. The topic of discussion is relations per Codd and the RA that operates on them. Sets are off-topic.

The extended RA I proposed has no tuples, anywhere. It has headings (for projection and rename) and it has functions (for selection, new values and aggregation) but absolutely no tuples. Yes, you need some kind of syntax for literals but that's a language choice and that doesn't have to be tuples either.
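A rough sketch of what such operator signatures might look like (invented names, not the proposal itself): the user-visible arguments are a relation, a heading, and a function. The tuple representation surfaces only in the definitions, in the function's parameter type, which is of course exactly the point under dispute.

```java
import java.util.*;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Sketch: the operator *signatures* mention only relations, headings, and
// functions; the tuple representation appears only inside the definitions.
final class RA {
    // A relation, treated as opaque by callers; internally a set of maps.
    record Rel(Set<String> heading, Set<Map<String, Object>> body) {}

    // Projection is driven purely by a heading.
    static Rel project(Rel r, Set<String> heading) {
        Set<Map<String, Object>> out = r.body().stream()
            .map(t -> {
                Map<String, Object> p = new HashMap<>(t);
                p.keySet().retainAll(heading); // keep only projected attributes
                return p;
            })
            .collect(Collectors.toSet()); // set semantics collapse duplicates
        return new Rel(heading, out);
    }

    // Restriction is driven by a function from attribute values to boolean.
    static Rel restrict(Rel r, Predicate<Map<String, Object>> f) {
        return new Rel(r.heading(),
            r.body().stream().filter(f).collect(Collectors.toSet()));
    }
}
```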

In what sense is whatever you proposed any sort of RA? How do we, for example, attach characteristic predicates to relations so that we (or rather the users of a database) can tell whether the database content matches the 'mini-world' of the enterprise?

Remember that the operations of the RA are a means to enquire about the salient facts, and their implications, that the database content represents.

A relation exposes a heading, and the business predicate relates the business facts to the content of the relation by means of the heading. The results of an enquiry will always be another relation, with a heading. The implementation will provide a means to convert between relations and other representations, but that's not part of the RA.

The semantics of the relational operators are in terms of headings and (body) tuples.

Again, this is true at a definitional and implementation level but is never visible in the RA.

It's fundamental to the semantics of relational algebras. There's no requirement to expose tuples or tuple types as self-standing constructs in an implementation, but they're unavoidably essential to the semantics of relational algebra operations, and the definition of relations themselves.

We're going round in circles. As I said, relations and the RA are defined in terms of sets of tuples. They're implemented in a way that respects tuples. But we must then think about relations and the RA in a way that ignores tuples, or we have no way to safer and higher. The tuple-specific features of TTM and TD are a serious and unnecessary obstruction. They're like individual bytes intruding into thinking about (Unicode) text, or floating point numbers intruding into thinking about matrix manipulation. The power of higher-level thinking is knowing what you can ignore.

I'm absolutely serious about this. The only way you can really think about any of this stuff at any level is by not thinking about the levels further down. To really think about relations you have to not think about tuples or values or strings or characters or encodings or bytes (or memory cells or chips or silicon or electrons).

Indeed, if you're manipulating strings you should not have to consider what encoding is used or how many bytes or bits per character or whether the strings are implemented as arrays or ropes or whatever concrete data structure might be behind the string interface(s), but the semantics of string manipulation necessarily describes what happens to characters in strings.

Implementation detail again. To go safer and higher you have to be able to ignore detail at the lower level. This is non-optional -- it is the only way.

No, you're conflating semantic foundations and conceptual abstractions with elided implementation details.

You can, of course, abstract string operations so that you have some construct S with algebraic operations append, insert, at and so forth, and have it contain elements. That is more abstract (what you call "higher", I guess) than a string of characters -- you can then call it a dynamic array or whatever -- but if you use it on characters then it is semantically a string and its elements are characters.

No, that's not really it. Construct T (for text) has operations to append, insert, at, of course, but the arguments are text objects. No strings, no characters (except in the definitions and implementation).

I found a patent: https://patents.google.com/patent/US5859636A/en. There isn't much out there, but it has some ideas.

  • You can query a text object for telephone numbers, dates, names, addresses.
  • You can make a word frequency table.
  • You can reformat it in various ways.

But you can't do any of this if you have to keep worrying about the encoding or the line endings or the embedded formatting.
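One of those bullet points can be sketched quickly. This is an illustration of the idea, not the patent's design: a text value exposes a word-frequency operation, and the caller never touches encodings, bytes, or line endings (the `Text` class and its internal `String` representation are assumptions of this example).

```java
import java.util.*;
import java.util.stream.Collectors;

// Sketch: a word-frequency table as an operation on a text value.
// The representation (here a String) is hidden behind the interface;
// callers never see encodings, bytes, or line endings.
final class Text {
    private final String value;

    Text(String value) { this.value = value; }

    // Count occurrences of each word, case-insensitively.
    Map<String, Long> wordFrequency() {
        return Arrays.stream(value.toLowerCase().split("\\W+"))
            .filter(w -> !w.isEmpty())
            .collect(Collectors.groupingBy(w -> w, Collectors.counting()));
    }
}
```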

Taking such abstractions to their utmost extent you wind up (in practical terms) with nothing but values and operators -- which is fine and much can be done with that, obviously -- but it's only useful in terms of the constructs defined with it: strings that contain characters (and string operations in terms of characters), arrays that contain elements (and array operations in terms of elements), and relations that contain tuples (and relational algebra operators in terms of tuples.)

Some of these things may be defined in terms of others, but each such definition is only meaningful in terms of its semantics (by definition), so you can't simply discard tuples from relations and claim it's "higher relations" (or something), when what you've created is actually a semantically different entity.

No, the semantics is unchanged. Same definition, same RA. So the "higher" I want to think about is how to compose relations. If I have a set of 10 or 100 relations that form a data model, how do I think about the relationships between relations? All I really have to go on is the headings (and the corresponding predicates), so what are the higher-level operations that would allow me to perform safe operations on the data model as a whole? You know how to do that the hard way, in SQL using the catalog, but where is the abstraction for that?
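As a small illustration of reasoning over a data model from headings alone (invented names; a deliberately naive notion of "joinable"): given only the headings of a schema, one can derive which relations share attributes and are therefore natural-join candidates, without ever looking at a tuple.

```java
import java.util.*;

// Sketch: derive a "join graph" for a schema from headings alone.
// Two relations are treated as naturally joinable when their headings
// share at least one attribute. No tuples are consulted anywhere.
final class Schema {
    static Map<String, Set<String>> joinGraph(Map<String, Set<String>> headings) {
        Map<String, Set<String>> graph = new HashMap<>();
        for (String a : headings.keySet()) {
            Set<String> neighbours = new TreeSet<>();
            for (String b : headings.keySet()) {
                if (a.equals(b)) continue;
                Set<String> shared = new HashSet<>(headings.get(a));
                shared.retainAll(headings.get(b));
                if (!shared.isEmpty()) neighbours.add(b);
            }
            graph.put(a, neighbours);
        }
        return graph;
    }
}
```

Run against the familiar suppliers-and-parts headings, S and P each connect to SP but not to each other, which is the shape of the data model recovered from headings only.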

Andl - A New Database Language - andl.org
Quote from dandl on April 11, 2021, 10:56 am
Quote from Dave Voorhis on April 11, 2021, 9:38 am
Quote from dandl on April 11, 2021, 12:35 am
Quote from Dave Voorhis on April 10, 2021, 3:47 pm
Quote from dandl on April 10, 2021, 6:45 am

I'm afraid you've lost me here. It's not clear what point you're trying to make.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

I'm afraid you've lost me here. It's not clear what point you're trying to make.

The point is simple enough. Rather than adding more detail, more complexity, more features to the GP languages we have:

  • Make languages that are safer by removing lower-level concerns such as nulls, exceptions, casts and so on (which the compiler can deal with)
  • Then make them higher by removing implementation detail such as bits, bytes, characters and in due course strings, tuples, arrays
  • With the aim of making them shorter, so that less code does more work.

These are things many people have tried to do, and they always got stuck because the things that got left out turned out to be needed, and the X chunk couldn't talk to the Y chunk. But I do see the time getting closer. We're so used to our close-to-the-metal programming paradigm that we think it has to be that way, but when I write C++ now I realise how truly terrible it is to have to stoop so low. I credit Linq and Andl for showing hints of how it might be.
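The "shorter" claim can be shown concretely. A hedged sketch in Java streams (the closest Java analogue to LINQ; the class and method names are invented for this example): the same query at two levels, one spelled out in iteration and bookkeeping, one stated declaratively.

```java
import java.util.*;
import java.util.stream.Collectors;

// Sketch of "less code does more work": the same query twice.
final class Shorter {
    // Lower level: explicit iteration, mutation, and bookkeeping.
    static List<String> loud(List<String> words) {
        List<String> out = new ArrayList<>();
        for (String w : words) {
            if (w.length() > 3) out.add(w.toUpperCase());
        }
        Collections.sort(out);
        return out;
    }

    // Higher level: say what you want, not how to compute it.
    static List<String> loudDeclarative(List<String> words) {
        return words.stream()
            .filter(w -> w.length() > 3)
            .map(String::toUpperCase)
            .sorted()
            .collect(Collectors.toList());
    }
}
```

The two produce identical results; the difference is purely in how much incidental machinery the reader must hold in mind.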

Quote from dandl on April 11, 2021, 2:47 pm

I'm still entirely baffled by where you're going with all this, but...

LINQ is a halfway implementation of functional programming, limited to certain mechanisms, constrained by the limitations of an otherwise imperative language. To really see how far the notion can (and perhaps should) be taken, look at Haskell, ML, F#, Lisp, and (in a different direction) Prolog.

Re safer, removing nulls is good, though somewhat complicated by conventional programming languages that embed null in the fundamental language semantics. The rest are -- or should be -- only at the closest-to-the-metal layer and abstracted away above it. I can't remember the last time I used a cast at an application level in Java, and exceptions should be for things that are truly exceptional -- like the network or a disk drive going away -- to allow controlled exit or shutdown, not how they're sometimes used as a (bad) way to return additional values from a method call.
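Removing nulls at the API level can be sketched with `Optional`, which makes absence part of the compiler-visible type rather than a runtime surprise (the `Lookup` class and its table are assumptions of this example):

```java
import java.util.*;

// Sketch: Optional makes absence explicit in the return type,
// so callers must handle the missing case rather than risk a null.
final class Lookup {
    private static final Map<String, Integer> PORTS =
        Map.of("http", 80, "https", 443);

    static Optional<Integer> portFor(String scheme) {
        return Optional.ofNullable(PORTS.get(scheme));
    }
}
```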

Your higher and shorter appear to be your personal discovery of abstraction and procedures. It seems unlikely that you've only just discovered these, but perhaps your understanding of their application and meaning is new?

I'm not seeing that so far, though. I haven't seen anything to suggest anything but conventional interpretations of abstraction and procedures.

Quote from Dave Voorhis on April 11, 2021, 3:46 pm
Quote from dandl on April 11, 2021, 2:47 pm

I'm still entirely baffled by where you're going with all this, but...

LINQ is a halfway implementation of functional programming, limited to certain mechanisms, constrained by the limitations of an otherwise imperative language. To really see how far the notion can (and perhaps should) be taken, look at Haskell, ML, F#, Lisp, and (in a different direction) Prolog.

No, I already knew all that stuff and have used all those languages. What Linq showed me was the direct reduction in code complexity and the ability to think at a higher level within a language that was otherwise unchanged. [In the same way, my recent trip back into C++ showed me all the horrible gritty detail that Java/C# already omit.] In other respects C# is problematic: Linq required several additions to the language, so the overall cognitive load is higher, and that's also my experience with Haskell, Lisp and Prolog. The languages are complex and powerful, and still involve dealing with a lot of low-level concerns.

Re safer, removing nulls is good, though somewhat complicated by conventional programming languages that embed null in the fundamental language semantics. The rest are -- or should be -- only at the closest-to-the-metal layer and abstracted away above it. I can't remember the last time I used a cast at an application level in Java, and exceptions should be for things that are truly exceptional -- like the network or disk drive goes away -- to allow controlled exit or shutdown, not how they're sometimes used as a (bad) way to return additional values from a method call.

Your higher and shorter appear to be your personal discovery of abstraction and procedures. It seems unlikely that you've only just discovered these, but perhaps your understanding of their application and meaning is new?

I'm not seeing that so far, though. I haven't seen anything to suggest anything but conventional interpretations of abstraction and procedures.

I won't argue with your personal experience with Java. Mine is different, but you know where the dragons are and how to avoid them. What I'm proposing is most certainly abstraction, but procedures are not enough. You can't add streams to Java without first adding lambdas; you have to change the core of the language. You can't add TTM relations to Java just by writing a few procedures; the compiler won't check headings for you. And you can't even properly think about higher until safer frees you from the grunge. Which I think is where you are now: up to your armpits in Java grunge.

 
