Which type?

#21 · November 11, 2021, 11:58 am

Quote from dandl on November 11, 2021, 11:58 am

Quote from Dave Voorhis on November 11, 2021, 8:54 am

Quote from dandl on November 11, 2021, 12:48 am

[...]

In all the database models I can ever remember seeing, I have never found the need for more than 9 scalar types: boolean, integer, real, decimal, datetime, text string, binary string, enum, struct. (And I regard value inheritance as a pointless thought experiment.)

A programming language may have many more types, but IMO it should support those 9 as scalar value types if it is to be a database programming language (which is the aim of TTM/D). I don't know any that do.

I don't know any popular programming languages that don't support those as user-defined types in a library.

That's the crucial requirement, really -- not that those 9 (give or take) types be baked into the language, but that the language allow the user to define them, along with any other conceivable types that may be deemed desirable, and have them treated as notionally equivalent to built-in-to-the-language types, modulo the usual primitive vs non-primitive type issues.

The point is: "should support those 9 as scalar value types if it is to be a database programming language". It's not intended to be a high bar, but the languages I know don't have all these types natively, and encounter various restrictions when adding them using libraries. A D candidate (or SQL replacement) should support them 'as native', smoothly and seamlessly.

There are no other 'conceivable' types that I know of. Remember: this is just about the storage/data model. There are lots of other interesting types for writing programs, but then as per OO VSS 2, the operators on those types are not part of the type itself. So for example, you can store a Point or a Complex value as a struct of two elements, but they are of database type struct. The operators on Point and Complex are provided by a library in some programming language rather than being part of the database itself.
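As a rough illustration of that split, here is a minimal Python sketch. The names `Struct2`, `complex_mul` and `point_distance` are hypothetical and not taken from Andl or any D; the only assumption is that the stored value is a plain two-element struct and that all operators live in library code outside it.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Struct2:
    """What the database stores: two components and nothing else."""
    a: float
    b: float


# Library-side operators. The stored value has no idea whether it is
# "really" a Complex or a Point; that reading is supplied by the caller.
def complex_mul(x: Struct2, y: Struct2) -> Struct2:
    """Treat both structs as complex numbers (a + bi) and multiply them."""
    return Struct2(x.a * y.a - x.b * y.b, x.a * y.b + x.b * y.a)


def point_distance(p: Struct2, q: Struct2) -> float:
    """Treat both structs as 2D points and return the Euclidean distance."""
    return math.hypot(p.a - q.a, p.b - q.b)
```

The same stored struct can be handed to either function, which is the sense in which the database type is just struct.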

Andl - A New Database Language - andl.org

#22 · November 11, 2021, 12:18 pm

Quote from Paul Vernon on November 11, 2021, 10:11 am

If the only scalar 'type' is a binary string and the only operation on binary strings is a test for equality, then you have got two sets - the set of result values from equality (presumably binary string 0 and binary string 1), and the set of all binary strings (i.e. the set of possible inputs to the equality test). That would mean you have two types right? One you might call, oh say Boolean and the other say Everything. Hence even in such a limited system, you have a "type system".

Not so. There is no innate requirement that the result of a test for equality is ever made visible as a value; it might just be a low-level JEQ opcode. With no value there is no type.

Alternatively, the convention is that an empty bit string is False and any other bit string is True. Still no boolean type.

If it is not really possible to not have a "type system", then I am left unsure about the usefulness of the concept of "type system".

If the presence of a "type system" is directly implied by an axiom of atoms (i.e. the existence of scalars) and an axiom of sets (i.e. the existence of collections), and hence by the existence of two sets of things (a set of scalars and a set of sets), then again I am left unsure about the usefulness of the concept of "type system".

Not that that worries me. I don't find the concept very useful anyway. Very happy to dispense with it! :-)

Type systems are foundational in programming languages. 1950s Fortran as I recall had INTEGER and REAL (but no strings). They are not foundational in a data model or in the RM. The big assumption of TTM was the idea of imposing a programming language type system on a data model.

Andl - A New Database Language - andl.org

#23 · November 11, 2021, 12:20 pm

(replying to post #20)

You have different kinds of constructs -- such as values -- and there is a notion of which of them are "right" or "wrong" in some context. Ergo, type system.

For notions of "right" and "wrong" values, I prefer the term "safety" or "correctness" or "robustness", not "type system".

I think that it is easier/better/more powerful to construct a safety system and to prove correctness of expressions in ways other than via axiomatic types. Constraints from the relational model, and assertions from programming being the "obvious" better way.

If I want to say that only even numbers are the right input to (or output from) an expression, why should I have to go create an even type first?

What end is there to the multitude of types I would need to create?

I just don't see the need for them. Sets of values? Sure we need them, but do I really have to point at each one and say, that is a type, that is not a type, that one is a type, that one is not...

Take my even example. Is the set of even numbers a type or not? If so, why? If not, why?
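The two options being contrasted here can be sketched side by side. This is only an illustration in Python with hypothetical names (`is_even`, `half`, `Even`, `half_typed`); it does not settle which option is better, it just shows that both rest on the same predicate and differ in where the check lives.

```python
def is_even(n: int) -> bool:
    """The predicate that defines the set of even numbers."""
    return n % 2 == 0


# Option A: an assertion/constraint checked at the point of use.
def half(n: int) -> int:
    if not is_even(n):
        raise ValueError(f"{n} is not even")
    return n // 2


# Option B: a named type whose constructor enforces the same predicate,
# so that any function accepting an Even can skip the check.
class Even:
    def __init__(self, value: int) -> None:
        if not is_even(value):
            raise ValueError(f"{value} is not even")
        self.value = value


def half_typed(n: Even) -> int:
    return n.value // 2
```

In option A no `Even` type is ever declared; in option B the set has been given a name and the check moves to construction time.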

#24 · November 11, 2021, 12:50 pm

Quote from AntC on November 11, 2021, 11:12 am

I should have asked this earlier: which programming languages do you know/what type systems do they use?

I will mention that I took a look at Idris this year. I was intrigued by the premise of "Type-driven development" and wanted a functional language to look into (I was always quite taken with Miranda at university). That, and I always loved Ivor the Engine. :-)

Having said that, and earlier having said that I'm no mathematician, I would also consider myself "hardly a programmer". I can code, but (luckily I think) I know I'm not all that good at it, and don't code for the pleasure of it. Writing expressions (yes, that would mean SQL for most of my career 😞) is more my thing (and I am not all that sure how good I am at that either ...).

So Idris was interesting as far as I got. I like the premise, i.e. some answer to the "search for ways to improve the robustness and safety of software"; however, I did not come away agreeing with Brady that "expressing a program’s intention in its type" is the best way to do it.

#25 · November 11, 2021, 12:55 pm

Quote from dandl on November 11, 2021, 12:18 pm

The big assumption of TTM was the idea of imposing a programming language type system on a data model.

Indeed. An assumption. And (unfortunately) a harmful one, I have come to believe. Not that the intent was wrong, more that "type systems" are the wrong answer to the right question.

"Catching a significant number of mistakes" and "Improving the robustness and safety of software" are the right questions. We just need the right answer.

#26 · November 11, 2021, 1:49 pm

Quote from Dave Voorhis on November 11, 2021, 1:49 pm

Quote from Paul Vernon on November 11, 2021, 12:20 pm

(replying to post #20)

You have different kinds of constructs -- such as values -- and there is a notion of which of them are "right" or "wrong" in some context. Ergo, type system.

For notions of "right" and "wrong" values, I prefer the term "safety" or "correctness" or "robustness", not "type system".

The notions of "right" and "wrong" values -- and using that to implement "safety", "correctness", and "robustness" (by which I take it to mean the ability to avoid run-time errors caused by "wrong" values by catching them at compile-time or run-time rather than crashing) -- are characteristics of type systems.

You might decide to call it a "value constraint system" instead of "type system", but by most interpretations, it would be a type system.

I think that it is easier/better/more powerful to construct a safety system and to prove correctness of expressions in ways other than via axiomatic types. Constraints from the relational model, and assertions from programming being the "obvious" better way.

If I want to say that only even numbers are the right input to (or output from) an expression, why should I have to go create an even type first?

There's nothing that obligates a type system to require manifest type definitions up-front. If your expression has pre-conditions on evaluation such that it will only evaluate if certain conditions hold true on the values of certain terms -- a sort of predicate dispatch, I suppose -- that would be a characteristic of your type system.

What end is there to the multitude of types I would need to create?

In most popular programming languages, you either use built-in types (baked into the language or a standard library) or define those that you need. Multitudes are rarely needed, though types can be parametric or otherwise dynamic, allowing (say) n implicit types to be derived from a single explicit definition.
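One way to picture "n implicit types from a single explicit definition" is a type factory. The Python sketch below is an assumption-laden illustration, not how any particular language does it: `bounded_int`, `Percent` and `Month` are hypothetical names, and the range check happens at run time rather than compile time.

```python
def bounded_int(lo: int, hi: int) -> type:
    """Return a new int subtype whose values must lie in [lo, hi]."""

    class Bounded(int):
        def __new__(cls, value: int) -> "Bounded":
            if not lo <= value <= hi:
                raise ValueError(f"{value} is outside {lo}..{hi}")
            return super().__new__(cls, value)

    Bounded.__name__ = f"Int[{lo}..{hi}]"
    return Bounded


# One explicit definition above, any number of derived types below.
Percent = bounded_int(0, 100)
Month = bounded_int(1, 12)
```

Each call yields a distinct type, so `Percent` and `Month` are separate even though neither was written out by hand.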

I just don't see the need for them. Sets of values? Sure we need them, but do I really have to point at each one and say, that is a type, that is not a type, that one is a type, that one is not...

Take my even example. Is the set of even numbers a type or not? If so, why? If not, why?

The set of even numbers is a type.

Whether you feel the need for them or not, types will exist. Your language will inevitably have type system semantics -- unless the only value type is some completely opaque blob with no operations (or no visible values at all) -- whether you explicitly consider them or not.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#27 · November 11, 2021, 2:04 pm

Quote from Dave Voorhis on November 11, 2021, 2:04 pm

Quote from dandl on November 11, 2021, 11:58 am

Quote from Dave Voorhis on November 11, 2021, 8:54 am

Quote from dandl on November 11, 2021, 12:48 am

[...]

In all the database models I can ever remember seeing, I have never found the need for more than 9 scalar types: boolean, integer, real, decimal, datetime, text string, binary string, enum, struct. (And I regard value inheritance as a pointless thought experiment.)

A programming language may have many more types, but IMO it should support those 9 as scalar value types if it is to be a database programming language (which is the aim of TTM/D). I don't know any that do.

I don't know any popular programming languages that don't support those as user-defined types in a library.

That's the crucial requirement, really -- not that those 9 (give or take) types be baked into the language, but that the language allow the user to define them, along with any other conceivable types that may be deemed desirable, and have them treated as notionally equivalent to built-in-to-the-language types, modulo the usual primitive vs non-primitive type issues.

The point is: "should support those 9 as scalar value types if it is to be a database programming language". It's not intended to be a high bar, but the languages I know don't have all these types natively, and encounter various restrictions when adding them using libraries. A D candidate (or SQL replacement) should support them 'as native', smoothly and seamlessly.

I presume "as native" allows them to be defined as part of a standard library?

There are no other 'conceivable' types that I know of. Remember: this is just about the storage/data model. There are lots of other interesting types for writing programs, but then as per OO VSS 2, the operators on those types are not part of the type itself. So for example, you can store a Point or a Complex value as a struct of two elements, but they are of database type struct. The operators on Point and Complex are provided by a library in some programming language rather than being part of the database itself.

The same applies to decimal and datetime -- struct for the former and (usually) an integer or two for the latter.

There are plenty of conceivable types, of course -- temperature, currency (not quite the same as decimal), distance, vector, etc., plus every domain-specific type to enforce type safety, so invoice_number, customer_number, and so on ad infinitum. Of course, you might suggest that decimal is somehow more typeful than invoice_number, but why, if both are based on integer?
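A small Python sketch of the invoice_number / customer_number point follows. The class and function names are hypothetical, and the explicit `isinstance` check stands in for what a statically typed language would reject at compile time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InvoiceNumber:
    value: int


@dataclass(frozen=True)
class CustomerNumber:
    value: int


def find_invoice(number: InvoiceNumber) -> str:
    """Accept only invoice numbers, even though both types wrap an int."""
    if not isinstance(number, InvoiceNumber):
        raise TypeError("expected an InvoiceNumber")
    return f"invoice {number.value}"
```

Both types are based on integer, yet `InvoiceNumber(42)` and `CustomerNumber(42)` are not interchangeable and do not compare equal.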

Thus, I'd argue that the measure of any language -- database or otherwise -- is in its ability to define types, not in whether (or how) it embeds (or doesn't) some canonical set of them.

#28 · November 11, 2021, 2:08 pm

Still I can determine no distinction between a set of values and a type.

Why use two words for the same thing?

> So what is a type? Essentially, it is a named, finite set of values.

[DTATRM]

> *Type*: A collection of values. An estimate of the collection of values that a program fragment can assume during program execution.

> The fundamental purpose of a type system is to prevent the occurrence of execution errors during the running of a program.

[TypeSystems](http://lucacardelli.name/Papers/TypeSystems.pdf)

If all sets of values are types, and types are just sets of values, what is the point of the word type?

(I really don't like having two representations for the same thing)

If types are the subset of all sets that "constitute an estimate of the collection of (sets of) values that a program fragment can assume during program execution", then OK, I can see a role for the word type. But then again, in general that set would be the set of everything, i.e. you can ask for any subset of values you like as the result of a database expression (well, TTM only allows relations, but that is a different point). You can create a set literal with any values you like (again, TTM would struggle with that, but again a different point). If type is the set of values that a "fragment can assume", that would in general be the powerset of the universal set.

If types are named sets of values, then OK - the set of sets that we can be bothered to name is a useful set. Still, I don't like the word type for that. *thing* might be better ?! (as in, "is that a thing" - i.e. "something" that is identifiable/recognisable as an interesting enough set to name).

#29 · November 11, 2021, 2:21 pm

Quote from Dave Voorhis on November 11, 2021, 2:21 pm

Quote from Paul Vernon on November 11, 2021, 2:08 pm

Still I can determine no distinction between a set of values and a type.

Why use two words for the same thing?

"Type" is four letters and one word. "Set of values" is thirteen letters and three words, so "type" is more ergonomic. :-)

A set of values is a type, but in a language a type also (typically) permits definition of rules for using that set of values, and mechanisms for treating values of that type as an atomic construct.

> So what is a type? Essentially, it is a named, finite set of values.

[DTATRM]

> *Type*: A collection of values. An estimate of the collection of values that a program fragment can assume during program execution.

> The fundamental purpose of a type system is to prevent the occurrence of execution errors during the running of a program.

[TypeSystems](http://lucacardelli.name/Papers/TypeSystems.pdf)

If all sets of values are types, and types are just sets of values, what is the point of the word type?

(I really don't like having two representations for the same thing)

If types are the subset of all sets that "constitute an estimate of the collection of (sets of) values that a program fragment can assume during program execution", then OK, I can see a role for the word type. But then again, in general that set would be the set of everything, i.e. you can ask for any subset of values you like as the result of a database expression (well, TTM only allows relations, but that is a different point). You can create a set literal with any values you like (again, TTM would struggle with that, but again a different point). If type is the set of values that a "fragment can assume", that would in general be the powerset of the universal set.

If types are named sets of values, then OK - the set of sets that we can be bothered to name is a useful set. Still, I don't like the word type for that. *thing* might be better ?! (as in, "is that a thing" - i.e. "something" that is identifiable/recognisable as an interesting enough set to name)

The problem is that "thing" is too broad. You could also call, say, an "if" statement or a variable a "thing", but neither (in the usual popular programming languages, at least) is or belongs to a type.

#30 · November 11, 2021, 2:41 pm

Right, so a type then is a (typically) named, (typically, but not exclusively) fixed, set of values that is (commonly) used within constraint expressions (or other "safety mechanisms or rules").

Now that is a definition of type that I could begin to like.

Something fuzzy, I am happy with. It helps to reinforce the point that the concept is not fundamental, not axiomatic, but more something that "emerges" with use.

Integer, yes a really good type. Positive integer ditto. Even, OK(ish). Integer between -2^63 and 2^63-1, yes, sure a great type when looking at performance optimisation and machine representations. Even Prime, bit rubbish, just use 2. The primes of the form 2^n - 1, well, OK if that's helpful. etc

And then a type system is, as discussed, just a synonym for a "value constraint system", or some such system of "safety". I.e. it is the whole of the constraints, and the sets of values used in those constraints. Again, I could begin to like such a definition.
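Read that way, a type is little more than a name attached to a membership predicate, and one value can belong to several named sets at once. The Python sketch below illustrates that reading under stated assumptions: `NAMED_SETS`, `types_of` and `is_prime` are hypothetical names, and the selection of sets simply mirrors the examples listed above.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for the small values used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


# Each "type" is just a name bound to a predicate over integers.
NAMED_SETS = {
    "Integer": lambda v: True,
    "PositiveInteger": lambda v: v > 0,
    "Even": lambda v: v % 2 == 0,
    "Int64": lambda v: -2**63 <= v <= 2**63 - 1,
    # Primes of the form 2^n - 1: v + 1 must be a power of two.
    "MersennePrime": lambda v: v > 0 and ((v + 1) & v) == 0 and is_prime(v),
}


def types_of(value: int) -> list[str]:
    """Return, in sorted order, the names of all sets containing value."""
    return sorted(name for name, pred in NAMED_SETS.items() if pred(value))
```

Nothing here is axiomatic: adding or dropping an entry in `NAMED_SETS` changes which sets count as "types" without changing the values themselves.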

The Forum for Discussion about The Third Manifesto and Related Matters