Which type?

#31 · November 11, 2021, 2:47 pm

Quote from Paul Vernon on November 11, 2021, 2:41 pm

Right, so a type then is a (typically) named set of values that is (commonly) used within constraint expressions (or other "safety mechanisms or rules").

A type is not necessarily named. It would perhaps be more accurate to describe a type (at least in TTM contexts, and perhaps some distance beyond) as an identifiable set of values or a specified set of values, or that a type denotes a specific set of values.

Then, a type is commonly used to implement constraints on (sets of) values and define what operations may be performed on (sets of) values.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#32 · November 11, 2021, 3:42 pm

I would join those two statements. A type identifies, specifies, denotes or names a specific set of values which is intended to be used to implement constraints on (sets of) values and/or define what operations on (sets of) values will return non-empty results.

I.e. if you identify a set of values, but don't then go on to use them as part of your "type system" of constraints/safety/correctness then you should not call that set a type.

I would probably also add that such sets of values should typically be constant over time, or at least into the near future, with any changes best planned in advance and future dated.

In practice, I guess you just have a relation that lists all your type names or type sets for unnamed types. You possibly would not constrain that relation but simply "trust" users to only nominate "genuine" types. I mean you could constrain it to enforce that each type is the range of at least one function/operator input/output, and/or is explicitly referenced in at least one constraint. Not sure if such a constraint would be helpful or not.

Which all rather parallels the situation of functions. A function is just a named set of values (well, most always a relation of degree 2 or more). Why do we need the word function if that is all it is? Again, the meaning emerges from its intended use. The relation < can be considered a function because we use it in expressions.

#33 · November 11, 2021, 4:20 pm

I guess we could reserve the word type for sets used in constraints, the word domain for sets that are the domain of functions, and codomain for sets that are the codomain of functions. Often a set would be both or all three, but maybe a useful set of sets would not be.

#34 · November 11, 2021, 4:31 pm

Quote from Dave Voorhis on November 11, 2021, 4:31 pm

Quote from Paul Vernon on November 11, 2021, 3:42 pm

I would join those two statements. A type identifies, specifies, denotes or names a specific set of values which is intended to be used to implement constraints on (sets of) values and/or define what operations on (sets of) values will return non-empty results.

A type is still a type whether you use it or not. An empty type is also a respectable type, but not all type systems necessarily have one.

I.e. if you identify a set of values, but don't then go on to use them as part of your "type system" of constraints/safety/correctness then you should not call that set a type.

But it might be a type whether you use it or not. If I create a type -- say, "decimal" -- in a collection of types in a library intended for other users, but I don't use it, it's still a type.

I would probably also add that such sets of values should typically be constant over time, or at least into the near future, with any changes best planned in advance and future dated.

Types are generally invariant. E.g., "Integer" isn't going to change (unless there's a mathematical revolution) but "customers_we_billed_this_month" will change, and is generally not considered a type in and of itself. But then you might create a type to represent it as an abstraction, so that you can't -- for example -- inadvertently UNION it with (say) "customers_we_billed_last_month", etc.
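As a rough illustration of that last point, here is a minimal Python sketch in which a set of customers carries a type tag and a union across differently tagged sets is rejected. The class `TaggedSet`, its `tag` field, and the sample customer names are all hypothetical, invented for this illustration; none of them come from Rel or Tutorial D.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedSet:
    """A set of values labelled with the (hypothetical) type it was declared as."""
    tag: str
    values: frozenset

    def union(self, other: "TaggedSet") -> "TaggedSet":
        # Refuse to combine sets declared as different types, even though
        # the underlying values are plain strings in both cases.
        if self.tag != other.tag:
            raise TypeError(f"cannot UNION {self.tag} with {other.tag}")
        return TaggedSet(self.tag, self.values | other.values)


billed_this_month = TaggedSet("BilledThisMonth", frozenset({"alice", "bob"}))
billed_last_month = TaggedSet("BilledLastMonth", frozenset({"bob", "carol"}))
more_this_month = TaggedSet("BilledThisMonth", frozenset({"erin"}))

# Same declared type: the union is allowed.
same_type_union = billed_this_month.union(more_this_month)

# Different declared types: the union is rejected.
try:
    billed_this_month.union(billed_last_month)
    mixed_union_rejected = False
except TypeError:
    mixed_union_rejected = True
```

The contents of each set still change from month to month; only the tag, standing in for the type, stays fixed.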

In practice, I guess you just have a relation that lists all your type names or type sets for unnamed types. You possibly would not constrain that relation but simply "trust" users to only nominate "genuine" types. I mean you could constrain it to enforce that each type is the range of at least one function/operator input/output, and/or is explicitly referenced in at least one constraint. Not sure if such a constraint would be helpful or not.

Which all rather parallels the situation of functions. A function is just a named set of values (well, most always a relation of degree 2 or more). Why do we need the word function if that is all it is? Again, the meaning emerges from its intended use. The relation < can be considered a function because we use it in expressions.

A function is a type of relation in which there is one and only one return value for each argument value.
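That definition can be checked mechanically if a relation is modelled as a set of (argument, result) pairs. The following is a minimal Python sketch under that assumption; `is_function`, `successor`, and `square_roots` are names made up for this example, and the relations are restricted to a handful of integers so that they stay finite.

```python
def is_function(relation: set[tuple[int, int]]) -> bool:
    """True if no argument value is paired with two different result values.

    Only arguments that actually appear in the relation are checked.
    """
    seen: dict[int, int] = {}
    for argument, result in relation:
        if argument in seen and seen[argument] != result:
            return False  # same argument paired with two different results
        seen[argument] = result
    return True


# Successor over 0..3: each argument has one result, so it is a function.
successor = {(n, n + 1) for n in range(4)}

# Integer square roots of 0, 1, 4, 9: 4 is paired with both 2 and -2, and so on.
square_roots = {(r * r, r) for r in range(-3, 4)}

successor_is_function = is_function(successor)
square_roots_is_function = is_function(square_roots)
```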

#35 · November 11, 2021, 5:07 pm

A function is a type of relation in which there is one and only one return value for each argument value.

Is there a good word for e.g. √ that can return more than one value for each argument?

I like to use the word operator for operations on sets (UNION etc.) and function for operations on scalars (+ etc.), but that leaves square root and similar rather in the lurch.

Wikipedia says this for operator

There is no general definition of an operator, but the term is often used in place of function when the domain is a set of functions or other structured objects.

but offers no collective word for https://en.wikipedia.org/wiki/Square_root

#36 · November 11, 2021, 5:14 pm

Quote from Paul Vernon on November 11, 2021, 5:07 pm

A function is a type of relation in which there is one and only one return value for each argument value.

Is there a good word for e.g. √ that can return more than one value for each argument?

Relation.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#37 · November 11, 2021, 5:17 pm

😀

#38 · November 11, 2021, 11:59 pm

Specifically in Algebra A and in my formalism of the ERA: a relcon. Most relcons are simple functions returning a single value for one or more arguments, but multiple return values are not a problem.

Andl - A New Database Language - andl.org

#39 · November 12, 2021, 12:17 am

Quote from dandl on November 12, 2021, 12:17 am

Quote from Dave Voorhis on November 11, 2021, 2:04 pm

Quote from dandl on November 11, 2021, 11:58 am

Quote from Dave Voorhis on November 11, 2021, 8:54 am

Quote from dandl on November 11, 2021, 12:48 am

[...]

In all the database models I can ever remember seeing, I have never found the need for more than 9 scalar types: boolean, integer, real, decimal, datetime, text string, binary string, enum, struct. (And I regard value inheritance as a pointless thought experiment.)

A programming language may have many more types, but IMO it should support those 9 as scalar value types if it is to be a database programming language (which is the aim of TTM/D). I don't know any that do.

I don't know any popular programming languages that don't support those as user-defined types in a library.

That's the crucial requirement, really -- not that those 9 (give or take) types be baked into the language, but that the language allow the user to define them, along with any other conceivable types that may be deemed desirable, and have them treated as notionally equivalent to built-in-to-the-language types, modulo the usual primitive vs non-primitive type issues.

The point is: "should support those 9 as scalar value types if it is to be a database programming language". It's not intended to be a high bar, but the languages I know don't have all these types natively, and encounter various restrictions when adding them using libraries. A D candidate (or SQL replacement) should support them 'as native', smoothly and seamlessly.

I presume "as native" allows them to be defined as part of a standard library?

The C language does almost nothing without the inclusion of a standard library. That's a language design issue, not germane to this question.

There are no other 'conceivable' types that I know of. Remember: this is just about the storage/data model. There are lots of other interesting types for writing programs, but then as per OO VSS 2, the operators on those types are not part of the type itself. So for example, you can store a Point or a Complex value as a struct of two elements, but they are of database type struct. The operators on Point and Complex are provided by a library in some programming language rather than being part of the database itself.

The same applies to decimal and datetime -- struct for the former and (usually) an integer or two for the latter.

If you take that line of thinking, binary blob does it all. I don't. The aim is a set that is sufficient rather than necessary or minimal, and that aligns with the data types found in other data models. If the nine types are supported, everything else can be readily accommodated by sets of operators on those types.

[As an aside, struct is a poor choice for decimal because the struct members have no useful names. Integer is a poor choice for date/time because (a) you also need an epoch (b) it sets arbitrary limits on precision.]

There are plenty of conceivable types, of course -- temperature, currency (not quite the same as decimal), distance, vector, etc., plus every domain-specific type to enforce type safety, so invoice_number, customer_number, and so on ad infinitum. Of course, you might suggest that decimal is somehow more typeful than invoice_number, but why, if both are based on integer?

Thus, I'd argue that the measure of any language -- database or otherwise -- is in its ability to define types, not in whether (or how) it embeds (or doesn't) some canonical set of them.

And that is the point: types of the kinds you mention are required by programming languages but are not inherently needed in the data model. Start with the bedrock of the 9 types and every data model is supported, at least at the storage level. Add libraries of operators as you add programming types, but as artefacts of the language through which the data model is viewed, not part of the model itself.

This is an inherent problem with TTM and not with SQL. The intention is a language-independent base view on the data, and language dependent views above that.
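To make the proposed split concrete, here is a minimal Python sketch, assuming the database hands back a struct as a plain tuple of named fields. `StoredStruct`, `point_add`, and `point_distance` are hypothetical names for this illustration: the first stands in for the database-level struct type, the other two for a language-side operator library.

```python
import math
from typing import NamedTuple


class StoredStruct(NamedTuple):
    """Stand-in for the database type 'struct': two real fields, no operators."""
    x: float
    y: float


# Language-side library: operators that interpret the struct as a Point.

def point_add(a: StoredStruct, b: StoredStruct) -> StoredStruct:
    """Component-wise addition; the result is again just a struct."""
    return StoredStruct(a.x + b.x, a.y + b.y)


def point_distance(a: StoredStruct, b: StoredStruct) -> float:
    """Euclidean distance between two structs read as points."""
    return math.hypot(a.x - b.x, a.y - b.y)


origin = StoredStruct(0.0, 0.0)
corner = StoredStruct(3.0, 4.0)

moved = point_add(corner, StoredStruct(1.0, 1.0))
distance = point_distance(origin, corner)
```

In this sketch nothing stops a different library from reading the same struct as a Complex value, which is the type-safety concern raised on the other side of the argument.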

#40 · November 12, 2021, 12:18 am

Quote from Paul Vernon on November 11, 2021, 5:07 pm

A function is a type of relation in which there is one and only one return value for each argument value.

Is there a good word for e.g. √ that can return more than one value for each argument?

In a programming language, a function/operator can return only one result. That result might be a pair or data structure with identifiable fields or components such that we might say loosely "more than one value". From the point of view of programming language semantics, that's one result at one type.
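In Python terms (a sketch only, with `both_roots` a name invented here), an operator that appears to return two values in fact returns a single tuple, that is, one result at one type:

```python
import math


def both_roots(x: float) -> tuple[float, float]:
    """Return the positive and negative real square roots of a non-negative x as ONE pair."""
    root = math.sqrt(x)
    return (root, -root)


result = both_roots(9.0)      # a single value whose type is tuple[float, float]
positive, negative = result   # unpacking is a convenience, not a second return value
```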

One advantage of a type system is that it disciplines thought, rather than talking loosely or arm-waving.

... Having said that, and earlier having said that I'm no mathematician, I would also consider myself "hardly a programmer".

Yes that was a conclusion I was coming to already.

And yet you want to vaunt a bunch of ill-informed prejudices about type systems. Have you come here for a 5-minute argument, or would you like to take a course of ten at a discount?

I like to use the word operator for operations on sets - so UNION etc, and function for operations on scalars + etc, but that leaves square root and similar rather in the lurch.

Wikipedia says this for operator

Wikipedia for Operator (mathematics) says this, which you should have read first, and is much more to the point:

The definition you want is Operator (computer programming).

The Forum for Discussion about The Third Manifesto and Related Matters