The Forum for Discussion about The Third Manifesto and Related Matters


Which Reality?

Quote from Paul Vernon on December 5, 2021, 10:41 am
Quote from Erwin on December 5, 2021, 1:13 am

Attribute names we only have when there are relations.  "Type safety" in programming languages / compilers seems to me to be exactly about how to retain those very semantics when there are no relations in sight, as in pieces of code that just do some scalar computation.

That is all true. However, as per the Information Principle, all we (should) have are relations. Why do we hanker after what the Joneses have? We don't need variables (plural); we just need our one database variable, and that holds tuples with attributes with attribute names. Why do we need "scalar computation" when we have our relational operators (and functions as relations)?

Per the Information Principle, everything in the database is values in tables. That doesn't preclude having expressions that compute values to put in tables, nor does it preclude -- and it probably should be the case, simply for ergonomics' sake -- that even if all you have are relations, if you want to calculate something that a traditional language would handle with a traditional scalar expression, then you probably want it to at least look like a traditional scalar expression. And if that's the case, you probably want to provide something that at least looks like type safety, so that the system at least warns you if you attempt to -- for example -- inadvertently (or even intentionally, I suppose) divide today's date by an invoice number.
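
A minimal sketch of that kind of check, in Python purely for illustration (nothing in this thread prescribes a language). The types InvoiceNumber and Money and the helper try_divide are hypothetical names, and the check here happens at run time rather than at compile time.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class InvoiceNumber:
    """Hypothetical wrapper type: an invoice number is not just an int."""
    value: int


@dataclass(frozen=True)
class Money:
    """Hypothetical amount type that supports only sensible operations."""
    cents: int

    def __truediv__(self, divisor):
        # Dividing money by a plain count is meaningful; anything else is not.
        if isinstance(divisor, int) and not isinstance(divisor, bool):
            return Money(self.cents // divisor)
        return NotImplemented


def try_divide(left, right):
    """Return the quotient, or a description of why it was rejected."""
    try:
        return left / right
    except TypeError as exc:
        return f"rejected: {exc}"


today = date(2021, 12, 5)
invoice = InvoiceNumber(1042)

sensible = try_divide(Money(1000), 4)    # money split four ways
nonsense = try_divide(today, invoice)    # a date divided by an invoice number
```

The second call is refused because neither type defines a division that accepts the other, which is the "at least warn me" behaviour described above.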

As an aside, I'm not aware of any programming language where variable names are not fully arbitrary. I.e., I don't think there is a programming language that would complain if you created a variable named, say, SalesDate but made it of type TIMESTAMP, or a variable named i but put character values in it. I would be interested in counterexamples. I'd maybe accept a good example of "linting" or other external conformance checking, but would be really interested in any language that enforces semantic variable naming "conventions".

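As AntC mentioned, there's Fortran, where implicit typing makes undeclared names beginning with I through N default to INTEGER and all others to REAL. There's Hungarian Notation, which poisoned the programming world for a while, though usually as a form of documentation rather than type enforcement. Some dialects of classic BASIC used sigils to denote type (e.g., a trailing $ for strings).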

I dimly recall that Executable English (see http://executable-english.com) has some form of semantic naming but maybe I'm thinking of something else, like Plain English (see https://osmosianplainenglishprogramming.blog).  Yes, there is a bit of a theme here.

In the programming world, semantically significant naming is generally considered abominable, as it usually leads to fairly horrific surprises/failures (e.g., I've seen systems where the choice of database table names has magic effects in code generators). In my not-humble-at-all opinion, it's better (far better) to be upfront about your semantics and state them explicitly, rather than sneaking them in the back door via variable names and whatnot.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org
Quote from Dave Voorhis on December 5, 2021, 5:44 pm
Per the Information Principle, everything in the database is values in tables.

I suspected I would not get away with missing out the "in the database" part of that principle. Still, I've always assumed that what is outside of the database is of minimal interest (and anything that is of importance should damn well be brought into the database :-) )

That doesn't preclude having expressions that compute values to put in tables, nor does it preclude -- and it probably should be the case, simply for ergonomics' sake -- that even if all you have are relations, if you want to calculate something that a traditional language would handle with a traditional scalar expression, then you probably want it to at least look like a traditional scalar expression.

Indeed, that is the trick: to get a syntax that is actually just manipulating relations, but that looks (as far as you can get) like "traditional scalar expressions".  I have some ideas, and I might even have got somewhere, but I'd certainly be interested in any pointers to anything like this trick out in the wild.

And if that's the case, you probably want to provide something that at least looks like type safety, so that the system at least warns you if you attempt to -- for example -- inadvertently (or even intentionally, I suppose) divide today's date by an invoice number.

Agreed

In the programming world, semantically significant naming is generally considered abominable, as it usually leads to fairly horrific surprises/failures (e.g., I've seen systems where the choice of database table names has magic effects in code generators).

I think I've been guilty of that once (or maybe even twice...)

In my not-humble-at-all opinion, it's better (far better) to be upfront about your semantics and state them explicitly, rather than sneaking them in the back door via variable names and whatnot.

Certainly I'd not want to be sneaky about them. Certainly I'd want to be upfront. Food for thought.

Quote from Paul Vernon on December 5, 2021, 3:24 pm

Still, my point was more the question of whether the codes should be part of a type, and if so, would the literals be just strings (i.e. currency code would be a sub-type of the set of strings), or should they be 'new' literals? ¤FRF, say (or CURRENCY_CODE('FRF'), though I sort of think this is cheating because it looks like a function, not a literal, to me).

The ISO calls them "alphabetic codes" as far as I can see, so there is no real indication of whether they consider them "just" strings or "something else" that is not strings.

The best I can answer here is that many people seem to overlook that when it comes to representation, the fundamental type is STRING (or CHAR or TEXT or how you want to name it).  And when I say 'fundamental' I really mean 'fundamental' as in how the numbers 0 and 1 are fundamental to computing as we know it.  The string value STRING(FRF) is what we ***have to*** use whenever we want to convey something about the currency labeled 'FRF', which is the currency CURRENCY(STRING(FRF)).  Strings are the only things that can be "self-descriptive" in a sense, all the others require strings to become even simply denotable.  (What I mean is STRING values are the only values that do not [have to ?] depend on values of other types to become writable-down.  But all the others do, because there's no way for me, say, to refer to the number 17 without writing the STRING "17".)  When it comes to ***representation***, the most fundamental type of all is STRING.  I always get the feeling people have missed something when discussions like these (yours) pop up.  I mean I often feel they've missed the fact that they're confusing/conflating "bare essence" with "representation" and it is exactly this confusion/conflation that makes them wonder whether "currency codes should be a sub-type of STRING" (the set of ***string representations of currency identifiers*** is indeed a subset of the set of possible strings, but that does not make the set of currency identifiers itself so).
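
A small sketch of that distinction, in Python for illustration only. The class name Currency and the three-entry KNOWN_CODES set are assumptions made for the example (a tiny subset, not the full ISO list). The point is that the currency value is denoted by a string without being one.

```python
from dataclasses import dataclass

# Assumption: a tiny illustrative subset of codes, not the full ISO list.
KNOWN_CODES = frozenset({"FRF", "EUR", "USD"})


@dataclass(frozen=True)
class Currency:
    """A currency identifier; deliberately NOT a subclass of str."""
    code: str  # the string representation used to write the currency down

    def __post_init__(self):
        # Only some strings are representations of currency identifiers.
        if self.code not in KNOWN_CODES:
            raise ValueError(f"unknown currency code: {self.code!r}")


frf_text = "FRF"             # STRING(FRF): a string value
frf = Currency(frf_text)     # CURRENCY(STRING(FRF)): the currency it denotes

same_currency = frf == Currency("FRF")   # equal as currencies
same_as_string = frf == frf_text         # a currency is never equal to a string
```

The set of valid code strings is a subset of all strings, yet no Currency value is itself a string; only its representation is.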

Quote from Paul Vernon on December 5, 2021, 8:27 pm
Quote from Dave Voorhis on December 5, 2021, 5:44 pm

In the programming world, semantically significant naming is generally considered abominable, as it usually leads to fairly horrific surprises/failures (e.g., I've seen systems where the choice of database table names has magic effects in code generators).

I think I've been guilty of that once (or maybe even twice...)

In my not-humble-at-all opinion, it's better (far better) to be upfront about your semantics and state them explicitly, rather than sneaking them in the back door via variable names and whatnot.

Certainly I'd not want to be sneaky about them. Certainly I'd want to be upfront. Food for thought.

My position is that the best languages can do is leave it open and free, but also that programmers should do the best they can to make whatever they write as self-documenting as reasonably possible.  In my present employment, all tables are ***named*** as the concatenation of the identifier of a business domain (say, "ADGZ") and a meaningless number (say, "0010").  Attributes are ***named*** with the first six positions of the name repeating the business domain and the meaningless table number (2 positions usable instead of 4, so no doubt you see where this is headed if the table number is 0130), and then followed by a "meaningless" attribute number, except that ordering of the attribute numbers used for a particular table should already reflect the optimal ordering of the columns in SQL, meaning choices at the logical level should be influenced by properties at the physical level.  Field names in records in flatfiles are then supposed to ***follow the same naming convention***, because that "makes it easier" to track what is happening to any given column of the database throughout the whole application.  So I ***never*** get to see code like

AM_VAT := AM_BASE * PCT_VAT

but always

ADGZ999901 := ADGZ1020 * PARM5005

(The number '9999' here is used to indicate "non-table field" as you might perhaps have guessed.)  You tell me which kind of approach you'd be more inclined to have to maintain.

(PS and if attribute names are chosen ***by the database designers*** then at least in that portion of the working space the programmers aren't left with any option but to be exactly as self-documenting as the database designers have chosen to be.)

Quote from Paul Vernon on December 5, 2021, 8:27 pm
Quote from Dave Voorhis on December 5, 2021, 5:44 pm
Per the Information Principle, everything in the database is values in tables.

I suspected I would not get away with missing out the "in the database" part of that principle. Still, I've always assumed that what is outside of the database is of minimal interest (and anything that is of importance should damn well be brought into the database :-) )

That doesn't preclude having expressions that compute values to put in tables, nor does it preclude -- and it probably should be the case, simply for ergonomics' sake -- that even if all you have are relations, if you want to calculate something that a traditional language would handle with a traditional scalar expression, then you probably want it to at least look like a traditional scalar expression.

Indeed, that is the trick: to get a syntax that is actually just manipulating relations, but that looks (as far as you can get) like "traditional scalar expressions".  I have some ideas, and I might even have got somewhere, but I'd certainly be interested in any pointers to anything like this trick out in the wild.

Even if you conceptually define your language to be everything-is-a-relation (akin to Lisp's everything-is-a-list and Smalltalk's everything-is-an-object), if you're implementing something that looks like scalar expressions, you'll probably want to optimise away any visible relations underpinning scalar expressions and implement them as scalar expressions. Any relational view of them -- whether in whole or some components thereof -- can be derived from the scalar expression implementation mechanisms rather than underpin them.

That way, from a language user's point of view it can look like relations all the way down, even if -- for performance and ease of implementation -- it isn't.

Quote from Dave Voorhis on December 6, 2021, 12:11 am
Quote from Paul Vernon on December 5, 2021, 8:27 pm
Quote from Dave Voorhis on December 5, 2021, 5:44 pm
Per the Information Principle, everything in the database is values in tables.

I suspected I would not get away with missing out the "in the database" part of that principle. Still, I've always assumed that what is outside of the database is of minimal interest (and anything that is of importance should damn well be brought into the database :-) )

That doesn't preclude having expressions that compute values to put in tables, nor does it preclude -- and it probably should be the case, simply for ergonomics' sake -- that even if all you have are relations, if you want to calculate something that a traditional language would handle with a traditional scalar expression, then you probably want it to at least look like a traditional scalar expression.

Indeed, that is the trick: to get a syntax that is actually just manipulating relations, but that looks (as far as you can get) like "traditional scalar expressions".  I have some ideas, and I might even have got somewhere, but I'd certainly be interested in any pointers to anything like this trick out in the wild.

Even if you conceptually define your language to be everything-is-a-relation (akin to Lisp's everything-is-a-list and Smalltalk's everything-is-an-object), if you're implementing something that looks like scalar expressions, you'll probably want to optimise away any visible relations underpinning scalar expressions and implement them as scalar expressions. Any relational view of them -- whether in whole or some components thereof -- can be derived from the scalar expression implementation mechanisms rather than underpin them.

That way, from a language user's point of view it can look like relations all the way down, even if -- for performance and ease of implementation -- it isn't.

A misunderstanding I think. Every language I know has two kinds of things, or more.

  • Lisp has lists and atoms. Any atom can be quoted and then its literal value can be assigned to a symbol (setq). Operations are composed as lists with a head and a tail of arguments.
  • Smalltalk has objects and selectors. Operations are composed as messages sent to objects and consist of a selector and arguments (objects).
  • Forth has words and primitive values. Words are operations on the stack.
  • The ERA can be defined purely in terms of relations and scalar values, very much like Lisp. Operations are represented as relcons.

In no case is a type system a requirement, but it helps. Operators know what they need; a type system just helps to avoid errors.

Andl - A New Database Language - andl.org
Quote from dandl on December 6, 2021, 1:16 am
Quote from Dave Voorhis on December 6, 2021, 12:11 am
Quote from Paul Vernon on December 5, 2021, 8:27 pm
Quote from Dave Voorhis on December 5, 2021, 5:44 pm
Per the Information Principle, everything in the database is values in tables.

I suspected I would not get away with missing out the "in the database" part of that principle. Still, I've always assumed that what is outside of the database is of minimal interest (and anything that is of importance should damn well be brought into the database :-) )

That doesn't preclude having expressions that compute values to put in tables, nor does it preclude -- and it probably should be the case, simply for ergonomics' sake -- that even if all you have are relations, if you want to calculate something that a traditional language would handle with a traditional scalar expression, then you probably want it to at least look like a traditional scalar expression.

Indeed, that is the trick: to get a syntax that is actually just manipulating relations, but that looks (as far as you can get) like "traditional scalar expressions".  I have some ideas, and I might even have got somewhere, but I'd certainly be interested in any pointers to anything like this trick out in the wild.

Even if you conceptually define your language to be everything-is-a-relation (akin to Lisp's everything-is-a-list and Smalltalk's everything-is-an-object), if you're implementing something that looks like scalar expressions, you'll probably want to optimise away any visible relations underpinning scalar expressions and implement them as scalar expressions. Any relational view of them -- whether in whole or some components thereof -- can be derived from the scalar expression implementation mechanisms rather than underpin them.

That way, from a language user's point of view it can look like relations all the way down, even if -- for performance and ease of implementation -- it isn't.

A misunderstanding I think. Every language I know has two kinds of things, or more.

It's a riff on the classic and intentionally tongue-in-cheek "everything is a ..." programming meme(s) about guiding or defining principles, nicely summarised at https://wiki.c2.com/?EverythingIsa

It presumably originates from the Pythagorean notion that everything is a number (i.e., everything is mathematical), though in a computer everything really is a number. Anything else depends on perspective (and hardware).

  • Lisp has lists and atoms. Any atom can be quoted and then its literal value can be assigned to a symbol (setq). Operations are composed as lists with a head and a tail of arguments.
  • Smalltalk has objects and selectors. Operations are composed as messages sent to objects and consist of a selector and arguments (objects).
  • Forth has words and primitive values. Words are operations on the stack.
  • The ERA can be defined purely in terms of relations and scalar values, very much like Lisp. Operations are represented as relcons.

In no case is a type system a requirement, but it helps. Operators know what they need; a type system just helps to avoid errors.

Lisp and Smalltalk have type systems. Only Forth and BCPL and most machine/assembly languages do not -- or more accurately, their type systems are primitive. Values are opaque -- i.e., carry no associated type information -- so it's up to the language user to keep track of the types of values.

Tcl's "everything is a string" is closer in spirit to "operators know what they need" than Forth (and BCPL, etc.) is.  It's "stringly typed" by design.

Forth and BCPL and most machine/assembly languages are more like "operators expect specific types; it's up to you to provide them."

Perhaps it would be more accurate to say type checking helps to avoid errors, as there is always a type system even if it is characterised by the absence of any type semantics in the language except for what operators expect, i.e., choice and correctness of operator invocations is left entirely to the language user.
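
As a rough illustration of "values are opaque; it's up to you to supply the right types", here is a sketch in Python (used only for illustration) with the standard struct module. The helper name is hypothetical. The same four bytes are read first as a float and then as an integer, and nothing stops the second reading.

```python
import struct


def reinterpret_float_bits_as_int(x):
    """Pack x as a 32-bit float, then read the same 4 bytes as a signed int."""
    raw = struct.pack("<f", x)            # 4 opaque bytes, no type tag attached
    (as_int,) = struct.unpack("<i", raw)  # the "operator" simply assumes int
    return as_int


bits_of_one = reinterpret_float_bits_as_int(1.0)   # 0x3F800000
```

No error is raised; the result is simply whatever the bit pattern happens to mean under the other interpretation, and keeping track of that is left to the language user.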

This has already been covered elsewhere in the recent "Which ... ?" threads.

Anyway, my point is simply that implementing scalar operations from relation constructs is almost certainly the hard and inefficient way to do it. There are standard ways of implementing scalar expression evaluation. It probably makes more sense -- at least for a practical implementation; an experimental exploration with no concern for efficiency might well do otherwise -- to implement scalar expressions that way and derive relations from them as (or if) appropriate, than construct an expression evaluator from relation primitives.
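
A minimal sketch of that direction of derivation, in Python for illustration only. The attribute names are borrowed from the AM_VAT example earlier in the thread; vat_amount, as_relation, the integer-percentage arithmetic and the finite domains are all assumptions made for the example. The scalar function is evaluated in the ordinary way, and a relation is derived from it only when one is wanted.

```python
from itertools import product


def vat_amount(base, pct):
    """Ordinary scalar function: evaluated directly, no relations involved."""
    return base * pct // 100


def as_relation(func, names, domains):
    """Derive a finite relation from a scalar function.

    names   -- argument attribute names followed by the result attribute name
    domains -- one finite iterable of values per argument
    Each tuple is a frozenset of (attribute name, value) pairs.
    """
    *arg_names, result_name = names
    body = set()
    for args in product(*domains):
        row = dict(zip(arg_names, args))
        row[result_name] = func(*args)
        body.add(frozenset(row.items()))
    return body


vat_relation = as_relation(
    vat_amount,
    ("AM_BASE", "PCT_VAT", "AM_VAT"),
    ([100, 200], [6, 21]),
)
```

Here the relation is a view over the scalar implementation, restricted to whatever finite domains are of interest, rather than the mechanism that computes the result.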

Quote from Erwin on December 5, 2021, 8:29 pm

The best I can answer here is that many people seem to overlook that when it comes to representation, the fundamental type is STRING  ...

I always get the feeling people have missed something when discussions like these (yours) pop up.  I mean I often feel they've missed the fact that they're confusing/conflating "bare essence" with "representation" ...

Great post Erwin. Thank you.

In my defence I might say that I've not missed the fact that I'm conflating "bare essence" with "representation". I've known (to a greater or lesser extent) that that is what I have been (Devil's) advocating.

The point about strings is also very interesting. We would need to assume we are using a typical IDE/text editor to write our representations, and hence disallow (for the present discussion) bitmap images, coloured text and anything else that is not (just) a (valid) sequence of Unicode code points. But I'm fine with that assumption (I think). The thought then is, if strings of 1 or more Unicode code points are indeed the most fundamental type, then does not the concept of a string rather fade away? Everything is a string as per Tcl at https://wiki.c2.com/?EverythingIsa

Maybe what I am really asking is whether we should even expect our model to be able to "get at" the bare essences. Maybe all we have are representations; all we have are strings.

So that - as far as the system/model is concerned - 2 is a string and 2 is a number and 2 might be the name of a movie.  Numbers in this view are a subset of the set of strings. The reason we know more about what numbers are is that we have relations that define operations such as SUCCESSOR on them (well, some numbers anyway). But - as far as the system is concerned - the string 2 is the number 2. The bare essence (whatever that might be - and maybe in the case of 2 it is the set of all sets of two things) is at best isomorphic to the representation in our model. That might be the closest we can get to "bare essence" in (any?) model.
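
A toy sketch of that view, in Python for illustration only. The ten-digit fragment of SUCCESSOR and the helper successor_of are assumptions made for the example. Everything is a string, and a string counts as a number only insofar as it takes part in the relation.

```python
DIGITS = "0123456789"

# SUCCESSOR as a relation: a set of (before, after) pairs of strings.
# Only a finite fragment is modelled here: ("0", "1") up to ("8", "9").
SUCCESSOR = set(zip(DIGITS, DIGITS[1:]))


def successor_of(text):
    """Look the string up in SUCCESSOR; None if it takes no part in it."""
    for before, after in SUCCESSOR:
        if before == text:
            return after
    return None


after_two = successor_of("2")            # "2" behaves as a number here
after_title = successor_of("Movie 2")    # this string does not
```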

I'm happy to let this thread peter out (I think), but I've a new one in mind that is hopefully more concrete and less philosophical (and less confused?) ...  stay tuned.
Quote from Erwin on December 5, 2021, 8:29 pm
Quote from Paul Vernon on December 5, 2021, 3:24 pm

Still, my point was more the question of whether the codes should be part of a type, and if so, would the literals be just strings (i.e. currency code would be a sub-type of the set of strings), or should they be 'new' literals? ¤FRF, say (or CURRENCY_CODE('FRF'), though I sort of think this is cheating because it looks like a function, not a literal, to me).

The ISO calls them "alphabetic codes" as far as I can see, so there is no real indication of whether they consider them "just" strings or "something else" that is not strings.

The best I can answer here is that many people seem to overlook that when it comes to representation, the fundamental type is STRING (or CHAR or TEXT or how you want to name it).

That's the fundamental representation type of literals.

The fundamental representation type of everything in computing is array of bytes, er, array of bits... er, transistors being on or off. :-)


A misunderstanding I think. Every language I know has two kinds of things, or more.

It's a riff on the classic and intentionally tongue-in-cheek "everything is a ..." programming meme(s) about guiding or defining principles, nicely summarised at https://wiki.c2.com/?EverythingIsa

It presumably originates from the Pythagorean notion that everything is a number (i.e., everything is mathematical), though in a computer everything really is a number. Anything else depends on perspective (and hardware).

I have a real problem with this. There is absolutely no reason whatsoever to connect a theory of computation with numbers. There are no numbers in the Turing Machine, just symbols on a tape. It does the issue a great disservice to keep going back to implementation trivia such as numbers and bits.

  • Lisp has lists and atoms. Any atom can be quoted and then its literal value can be assigned to a symbol (setq). Operations are composed as lists with a head and a tail of arguments.
  • Smalltalk has objects and selectors. Operations are composed as messages sent to objects and consist of a selector and arguments (objects).
  • Forth has words and primitive values. Words are operations on the stack.
  • The ERA can be defined purely in terms of relations and scalar values, very much like Lisp. Operations are represented as relcons.

In no case is a type system a requirement, but it helps. Operators know what they need; a type system just helps to avoid errors.

Lisp and Smalltalk have type systems. Only Forth and BCPL and most machine/assembly languages do not -- or more accurately, their type systems are primitive. Values are opaque -- i.e., carry no associated type information -- so it's up to the language user to keep track of the types of values.

I said it's not needed, and I stand by that. The original Lisp had no type system (I know, because I wrote programs in it). Dynamic and static type systems were grafted on later.

Smalltalk has only objects and selectors, no types. Some of those objects can be made to behave as if they were dynamically typed, but any object can respond to any message. See also https://stackoverflow.com/questions/20714259/smalltalk-type-system#20714495.

Tcl's "everything is a string" is closer in spirit to "operators know what they need" than Forth (and BCPL, etc.) is.  It's "stringly typed" by design.

Forth and BCPL and most machine/assembly languages are more like "operators expect specific types; it's up to you to provide them."

Perhaps it would be more accurate to say type checking helps to avoid errors, as there is always a type system even if it is characterised by the absence of any type semantics in the language except for what operators expect, i.e., choice and correctness of operator invocations is left entirely to the language user.

This has already been covered elsewhere in the recent "Which ... ?" threads.

No, there is not always a type system. A Turing Machine has no type system,  just symbols on a tape. The virtual machines underlying Lisp and Smalltalk (and arguably Forth) have no values and no type system, whereas the VMs for UCSD Pascal, Java, C# and my own Powerflex all provide values and a type system to go with them.

Anyway, my point is simply that implementing scalar operations from relation constructs is almost certainly the hard and inefficient way to do it. There are standard ways of implementing scalar expression evaluation. It probably makes more sense -- at least for a practical implementation; an experimental exploration with no concern for efficiency might well do otherwise -- to implement scalar expressions that way and derive relations from them as (or if) appropriate, than construct an expression evaluator from relation primitives.

Nobody proposed using relations as a substitute for scalar values. The RM is built out of exactly two things: relations (with a heading) and attribute values arranged in tuples that conform to the heading. TTM grafted on a type system in order to create a familiar programming language, but a 'fourth manifesto' could take an entirely different tack. No type system required.
