# Should a list of values be a table or something else?

Quote from Brian S on October 27, 2018, 1:11 pmI'm confused. Why is an allowable exception required for that?

Perhaps I have a different understanding of what constitutes an enumerated type. I thought values of an enumerated type are distinct from all other values of all other types. How else would a system keep track of all of the types a particular value belongs to--especially over time. For instance, suppose that a value belongs to more than one enumerated type.

PRIMES_BELOW_20: {INT(2),INT(3),INT(5),INT(7),INT(11),INT(17),INT(19)}

EVEN_NUMBERS_BELOW_10: {INT(2),INT(4),INT(6),INT(8)}

The value INT(2) now belongs to types INT, PRIMES_BELOW_20 and EVEN_NUMBERS_BELOW_10.

Can a variable of type INT be compared to a variable of type PRIMES_BELOW_20? Does the system need to check each defined enumerated type for commonality every time it encounters a comparison? What happens if some of the values of an enumerated type belong to different base types.

UNITS_OF_MEASURE: {INT(1), CHAR("ONE"), CHAR("EACH"), INT(2), CHAR("TWO"), CHAR("PAIR"), ...}

Can a variable of type CHAR be compared with a variable of type UNITS_OF_MEASURE? Or a variable of type INT be compared to a variable of type UNITS_OF_MEASURE? Then why not allow variables of type CHAR to be compared to variables of type INT?

In my view, the values of an enumeration should be distinct from what instantiates the selector of the enumeration.

Brian

I'm confused. Why is an allowable exception required for that?

Perhaps I have a different understanding of what constitutes an enumerated type. I thought values of an enumerated type are distinct from all other values of all other types. How else would a system keep track of all of the types a particular value belongs to--especially over time. For instance, suppose that a value belongs to more than one enumerated type.

PRIMES_BELOW_20: {INT(2),INT(3),INT(5),INT(7),INT(11),INT(17),INT(19)}

EVEN_NUMBERS_BELOW_10: {INT(2),INT(4),INT(6),INT(8)}

The value INT(2) now belongs to types INT, PRIMES_BELOW_20 and EVEN_NUMBERS_BELOW_10.

Can a variable of type INT be compared to a variable of type PRIMES_BELOW_20? Does the system need to check each defined enumerated type for commonality every time it encounters a comparison? What happens if some of the values of an enumerated type belong to different base types.

UNITS_OF_MEASURE: {INT(1), CHAR("ONE"), CHAR("EACH"), INT(2), CHAR("TWO"), CHAR("PAIR"), ...}

Can a variable of type CHAR be compared with a variable of type UNITS_OF_MEASURE? Or a variable of type INT be compared to a variable of type UNITS_OF_MEASURE? Then why not allow variables of type CHAR to be compared to variables of type INT?

In my view, the values of an enumeration should be distinct from what instantiates the selector of the enumeration.

Brian

Quote from Erwin on October 28, 2018, 10:56 amIn reply to "but duck the question: what in particular would tip the balance towards using a table?"

If I had a fully compliant language and the system's UDT facility won't cause any L10N/I18N issues (or what issues it poses isn't too cumbersome to overcome) then I'm hard-pressed to come up with a reason not to take the enum type way.

(As an example of "L10N/I18N issues" : java's facility to parse a given String and return an enum value if the given string is equal to the "name"(/"label") of one of the enumerated values, is such one. It gives me the facility to parse "Yes" into yes, but not the facility to parse "Oui"/"Ja"/"Si"/"Sim"/"Da"/... into yes.)

In reply to "but duck the question: what in particular would tip the balance towards using a table?"

If I had a fully compliant language and the system's UDT facility won't cause any L10N/I18N issues (or what issues it poses isn't too cumbersome to overcome) then I'm hard-pressed to come up with a reason not to take the enum type way.

(As an example of "L10N/I18N issues" : java's facility to parse a given String and return an enum value if the given string is equal to the "name"(/"label") of one of the enumerated values, is such one. It gives me the facility to parse "Yes" into yes, but not the facility to parse "Oui"/"Ja"/"Si"/"Sim"/"Da"/... into yes.)

Quote from Hugh on October 28, 2018, 3:45 pmBrian asks why I would prefer to use a relation in certain cases, so here's an example from one of my hobby databases. (It's the one I mentioned in another discussion in this forum where I and a friend in Australia were (and still are) investigating a generalisation of the well known "four fours" problem in recreational mathematics.)

VAR Cases VIRTUAL with ( RI := rel { tup { i 1 } , tup { i 2 } , tup { i 3 } , tup { i 4 } , tup { i 5 } , tup { i 6 } , tup { i 7 } , tup { i 8 } , tup { i 9 } } , RJ := RI RENAME { i AS j } , RK := RI RENAME { i AS k } , RL := RI RENAME { i AS l } ) : ( ( ( ( ( RI JOIN RJ ) WHERE j >= i ) JOIN RK ) WHERE k >= j ) JOIN RL ) WHERE l >= k;

That gives me a set of 4-tuples with all of the 495 possible combinations of 1, 2, 3, 4, 5, 6, 7, 8, 9. (Note: combinations, not permutations.)

I have three real relvars with foreign keys referencing Cases and several "exclusion" constraints preventing a tuple from appearing in more than one of those real relvars. Okay, all that could be done with an enumerated type, but it would have been so easy for me to set up and I would have to use invocations of THE_ operators all of the place instead of simple attribute references i, j, k, and l in the queries I use to extract interesting results from this database.

How could I have defined Cases as a user-defined enumerated type instead? And would it be a scalar type or a tuple type? What problems could I have solved that way that I couldn't solve my way? In what ways might my life have been a bit easier with an enumerated type? (It would save having to declare foreign keys, for sure, but what else?)

Incidentally I also have

VAR N VIRTUAL RELATION { n INTEGER } { TUPLE { n 1 } , TUPLE { n 2 } , TUPLE { n 3 } , TUPLE { n 4 } , TUPLE { n 5 } , TUPLE { n 6 } , TUPLE { n 7 } , TUPLE { n 8 } , TUPLE { n 9 } , TUPLE { n 10 } , TUPLE { n 11 } , TUPLE { n 12 } , TUPLE { n 13 } , TUPLE { n 14 } , TUPLE { n 15 } , TUPLE { n 16 } , TUPLE { n 17 } , TUPLE { n 18 } , TUPLE { n 19 } , TUPLE { n 20 } , TUPLE { n 21 } , TUPLE { n 22 } , TUPLE { n 23 } , TUPLE { n 24 } , TUPLE { n 25 } , TUPLE { n 26 } , TUPLE { n 27 } , TUPLE { n 28 } , TUPLE { n 29 } , TUPLE { n 30 } , TUPLE { n 31 } , TUPLE { n 32 } , TUPLE { n 33 } , TUPLE { n 34 } , TUPLE { n 35 } , TUPLE { n 36 } , TUPLE { n 37 } , TUPLE { n 38 } , TUPLE { n 39 } , TUPLE { n 40 } , TUPLE { n 41 } , TUPLE { n 42 } , TUPLE { n 43 } , TUPLE { n 44 } , TUPLE { n 45 } , TUPLE { n 46 } , TUPLE { n 47 } , TUPLE { n 48 } , TUPLE { n 49 } , TUPLE { n 50 } , TUPLE { n 51 } , TUPLE { n 52 } , TUPLE { n 53 } , TUPLE { n 54 } , TUPLE { n 55 } , TUPLE { n 56 } , TUPLE { n 57 } , TUPLE { n 58 } , TUPLE { n 59 } , TUPLE { n 60 } , TUPLE { n 61 } , TUPLE { n 62 } , TUPLE { n 63 } , TUPLE { n 64 } , TUPLE { n 65 } , TUPLE { n 66 } , TUPLE { n 67 } , TUPLE { n 68 } , TUPLE { n 69 } , TUPLE { n 70 } , TUPLE { n 71 } , TUPLE { n 72 } , TUPLE { n 73 } , TUPLE { n 74 } , TUPLE { n 75 } , TUPLE { n 76 } , TUPLE { n 77 } , TUPLE { n 78 } , TUPLE { n 79 } , TUPLE { n 80 } , TUPLE { n 81 } , TUPLE { n 82 } , TUPLE { n 83 } , TUPLE { n 84 } , TUPLE { n 85 } , TUPLE { n 86 } , TUPLE { n 87 } , TUPLE { n 88 } , TUPLE { n 89 } , TUPLE { n 90 } , TUPLE { n 91 } , TUPLE { n 92 } , TUPLE { n 93 } , TUPLE { n 94 } , TUPLE { n 95 } , TUPLE { n 96 } , TUPLE { n 97 } , TUPLE { n 98 } , TUPLE { n 99 } , TUPLE { n 100 } };

and those three real relvars also have foreign keys referencing N.

Hugh

Brian asks why I would prefer to use a relation in certain cases, so here's an example from one of my hobby databases. (It's the one I mentioned in another discussion in this forum where I and a friend in Australia were (and still are) investigating a generalisation of the well known "four fours" problem in recreational mathematics.)

VAR Cases VIRTUAL with ( RI := rel { tup { i 1 } , tup { i 2 } , tup { i 3 } , tup { i 4 } , tup { i 5 } , tup { i 6 } , tup { i 7 } , tup { i 8 } , tup { i 9 } } , RJ := RI RENAME { i AS j } , RK := RI RENAME { i AS k } , RL := RI RENAME { i AS l } ) : ( ( ( ( ( RI JOIN RJ ) WHERE j >= i ) JOIN RK ) WHERE k >= j ) JOIN RL ) WHERE l >= k;

That gives me a set of 4-tuples with all of the 495 possible combinations of 1, 2, 3, 4, 5, 6, 7, 8, 9. (Note: combinations, not permutations.)

I have three real relvars with foreign keys referencing Cases and several "exclusion" constraints preventing a tuple from appearing in more than one of those real relvars. Okay, all that could be done with an enumerated type, but it would have been so easy for me to set up and I would have to use invocations of THE_ operators all of the place instead of simple attribute references i, j, k, and l in the queries I use to extract interesting results from this database.

How could I have defined Cases as a user-defined enumerated type instead? And would it be a scalar type or a tuple type? What problems could I have solved that way that I couldn't solve my way? In what ways might my life have been a bit easier with an enumerated type? (It would save having to declare foreign keys, for sure, but what else?)

Incidentally I also have

VAR N VIRTUAL RELATION { n INTEGER } { TUPLE { n 1 } , TUPLE { n 2 } , TUPLE { n 3 } , TUPLE { n 4 } , TUPLE { n 5 } , TUPLE { n 6 } , TUPLE { n 7 } , TUPLE { n 8 } , TUPLE { n 9 } , TUPLE { n 10 } , TUPLE { n 11 } , TUPLE { n 12 } , TUPLE { n 13 } , TUPLE { n 14 } , TUPLE { n 15 } , TUPLE { n 16 } , TUPLE { n 17 } , TUPLE { n 18 } , TUPLE { n 19 } , TUPLE { n 20 } , TUPLE { n 21 } , TUPLE { n 22 } , TUPLE { n 23 } , TUPLE { n 24 } , TUPLE { n 25 } , TUPLE { n 26 } , TUPLE { n 27 } , TUPLE { n 28 } , TUPLE { n 29 } , TUPLE { n 30 } , TUPLE { n 31 } , TUPLE { n 32 } , TUPLE { n 33 } , TUPLE { n 34 } , TUPLE { n 35 } , TUPLE { n 36 } , TUPLE { n 37 } , TUPLE { n 38 } , TUPLE { n 39 } , TUPLE { n 40 } , TUPLE { n 41 } , TUPLE { n 42 } , TUPLE { n 43 } , TUPLE { n 44 } , TUPLE { n 45 } , TUPLE { n 46 } , TUPLE { n 47 } , TUPLE { n 48 } , TUPLE { n 49 } , TUPLE { n 50 } , TUPLE { n 51 } , TUPLE { n 52 } , TUPLE { n 53 } , TUPLE { n 54 } , TUPLE { n 55 } , TUPLE { n 56 } , TUPLE { n 57 } , TUPLE { n 58 } , TUPLE { n 59 } , TUPLE { n 60 } , TUPLE { n 61 } , TUPLE { n 62 } , TUPLE { n 63 } , TUPLE { n 64 } , TUPLE { n 65 } , TUPLE { n 66 } , TUPLE { n 67 } , TUPLE { n 68 } , TUPLE { n 69 } , TUPLE { n 70 } , TUPLE { n 71 } , TUPLE { n 72 } , TUPLE { n 73 } , TUPLE { n 74 } , TUPLE { n 75 } , TUPLE { n 76 } , TUPLE { n 77 } , TUPLE { n 78 } , TUPLE { n 79 } , TUPLE { n 80 } , TUPLE { n 81 } , TUPLE { n 82 } , TUPLE { n 83 } , TUPLE { n 84 } , TUPLE { n 85 } , TUPLE { n 86 } , TUPLE { n 87 } , TUPLE { n 88 } , TUPLE { n 89 } , TUPLE { n 90 } , TUPLE { n 91 } , TUPLE { n 92 } , TUPLE { n 93 } , TUPLE { n 94 } , TUPLE { n 95 } , TUPLE { n 96 } , TUPLE { n 97 } , TUPLE { n 98 } , TUPLE { n 99 } , TUPLE { n 100 } };

and those three real relvars also have foreign keys referencing N.

Hugh

*Coauthor of The Third Manifesto and related books.*

Quote from dandl on October 29, 2018, 2:55 amFWIW as a developer and with my software hat on, there are two principles that I would apply.

- Symbols used in code should never be user-visible. Programmers should always be able to rename symbols to meet new requirements or better assist reading of the code without concern for user-visible changes. Symbols should be translated into user-visible text by a data lookup.
- If something can be done equally well in data or code, then data should always be preferred. Changes in data are almost always simpler, faster and less error-prone than changes to code.
Which leads to the rather unfortunate prospect of having dozens if not hundreds of little lookup tables, very often with a lookup from a symbol to an expansion with the same value. Other than resorting to the OTLT or the http://wiki.c2.com/?GodTable it's not obvious how to avoid this.

FWIW as a developer and with my software hat on, there are two principles that I would apply.

- Symbols used in code should never be user-visible. Programmers should always be able to rename symbols to meet new requirements or better assist reading of the code without concern for user-visible changes. Symbols should be translated into user-visible text by a data lookup.
- If something can be done equally well in data or code, then data should always be preferred. Changes in data are almost always simpler, faster and less error-prone than changes to code.

Which leads to the rather unfortunate prospect of having dozens if not hundreds of little lookup tables, very often with a lookup from a symbol to an expansion with the same value. Other than resorting to the OTLT or the http://wiki.c2.com/?GodTable it's not obvious how to avoid this.

Quote from Dave Voorhis on October 29, 2018, 8:55 amQuote from dandl on October 29, 2018, 2:55 am... Which leads to the rather unfortunate prospect of having dozens if not hundreds of little lookup tables, very often with a lookup from a symbol to an expansion with the same value. Other than resorting to the OTLT or the http://wiki.c2.com/?GodTable it's not obvious how to avoid this.

It's essential, unavoidable complexity, and roughly the same notional complexity whether using multitudes of little lookup tables or multitudes of enumerated types.

It may be helpful to provide syntactic sugar or library routines to wrap the creation of such lookup tables.

Quote from dandl on October 29, 2018, 2:55 am... Which leads to the rather unfortunate prospect of having dozens if not hundreds of little lookup tables, very often with a lookup from a symbol to an expansion with the same value. Other than resorting to the OTLT or the http://wiki.c2.com/?GodTable it's not obvious how to avoid this.

It's essential, unavoidable complexity, and roughly the same notional complexity whether using multitudes of little lookup tables or multitudes of enumerated types.

It may be helpful to provide syntactic sugar or library routines to wrap the creation of such lookup tables.

*I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org*

Quote from dandl on October 30, 2018, 1:19 amSo in practice the solution I see and use in other situations is to have a type which refers to a table. It only works when the the values are not an enumerated type (no special logic for any values). Several kinds.

- The 'list' type. Permitted values are provided by a single delimited list such as
`'Up,Down,Neutral'`

, which is stored as a row in a table. The type specifies the table name and the key for the row. It's possible to write a constraint, and no translation is required in queries.- As 1, but using a regular expression.
- As 1, but using a piece of code that is evaluated dynamically with the value as an argument, and returns a boolean.
- The 'lookup' type. Permitted values are provided by the rows in a table, such as
`{ H: elevated, L: depressed, N: unchanged }`

. The type specifies the table name, key column and looked-up value column. It's possible to write a constraint, and a simple JOIN provides the looked up value for queries.- As 4, but using a piece of code that is evaluated dynamically with the value as an argument and returns the replacement value.
These techniques are not easy to apply in SQL (no types) or TTM (no variable table or attribute names, no dynamic code). They work well in dynamic languages (ruby, JS) and NoSQL.

So in practice the solution I see and use in other situations is to have a type which refers to a table. It only works when the the values are not an enumerated type (no special logic for any values). Several kinds.

- The 'list' type. Permitted values are provided by a single delimited list such as
`'Up,Down,Neutral'`

, which is stored as a row in a table. The type specifies the table name and the key for the row. It's possible to write a constraint, and no translation is required in queries. - As 1, but using a regular expression.
- As 1, but using a piece of code that is evaluated dynamically with the value as an argument, and returns a boolean.
- The 'lookup' type. Permitted values are provided by the rows in a table, such as
`{ H: elevated, L: depressed, N: unchanged }`

. The type specifies the table name, key column and looked-up value column. It's possible to write a constraint, and a simple JOIN provides the looked up value for queries. - As 4, but using a piece of code that is evaluated dynamically with the value as an argument and returns the replacement value.

These techniques are not easy to apply in SQL (no types) or TTM (no variable table or attribute names, no dynamic code). They work well in dynamic languages (ruby, JS) and NoSQL.

Quote from Dave Voorhis on October 30, 2018, 7:42 amQuote from dandl on October 30, 2018, 1:19 amSo in practice the solution I see and use in other situations is to have a type which refers to a table. It only works when the the values are not an enumerated type (no special logic for any values). Several kinds.

- The 'list' type. Permitted values are provided by a single delimited list such as
`'Up,Down,Neutral'`

, which is stored as a row in a table. The type specifies the table name and the key for the row. It's possible to write a constraint, and no translation is required in queries.- As 1, but using a regular expression.
- As 1, but using a piece of code that is evaluated dynamically with the value as an argument, and returns a boolean.
- The 'lookup' type. Permitted values are provided by the rows in a table, such as
`{ H: elevated, L: depressed, N: unchanged }`

. The type specifies the table name, key column and looked-up value column. It's possible to write a constraint, and a simple JOIN provides the looked up value for queries.- As 4, but using a piece of code that is evaluated dynamically with the value as an argument and returns the replacement value.
These techniques are not easy to apply in SQL (no types) or TTM (no variable table or attribute names, no dynamic code). They work well in dynamic languages (ruby, JS) and NoSQL.

These -- to the extent that I understand them -- seem straightforward in TTM, either via CONSTRAINT mechanisms evaluating arbitrary code in a user-defined type for 2, 3, and 5 (?), or one lookup table per requirement for 1 and 4 (and 5?).

Your approach appears to hint at "god tables" in item 1. Are you afraid of creating tables/relvars?

Sometimes hiding database definitions behind abstractions is an effective way to deal with the apparent complexity of defining multitudes of tables and/or column/attribute types.

Quote from dandl on October 30, 2018, 1:19 am

- The 'list' type. Permitted values are provided by a single delimited list such as
`'Up,Down,Neutral'`

, which is stored as a row in a table. The type specifies the table name and the key for the row. It's possible to write a constraint, and no translation is required in queries.- As 1, but using a regular expression.
- The 'lookup' type. Permitted values are provided by the rows in a table, such as
`{ H: elevated, L: depressed, N: unchanged }`

. The type specifies the table name, key column and looked-up value column. It's possible to write a constraint, and a simple JOIN provides the looked up value for queries.

These -- to the extent that I understand them -- seem straightforward in TTM, either via CONSTRAINT mechanisms evaluating arbitrary code in a user-defined type for 2, 3, and 5 (?), or one lookup table per requirement for 1 and 4 (and 5?).

Your approach appears to hint at "god tables" in item 1. Are you afraid of creating tables/relvars?

Sometimes hiding database definitions behind abstractions is an effective way to deal with the apparent complexity of defining multitudes of tables and/or column/attribute types.

*I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org*

Quote from Hugh on October 30, 2018, 4:16 pmQuote from Dave Voorhis on October 30, 2018, 7:42 amQuote from dandl on October 30, 2018, 1:19 am

`'Up,Down,Neutral'`

, which is stored as a row in a table. The type specifies the table name and the key for the row. It's possible to write a constraint, and no translation is required in queries.- As 1, but using a regular expression.
`{ H: elevated, L: depressed, N: unchanged }`

. The type specifies the table name, key column and looked-up value column. It's possible to write a constraint, and a simple JOIN provides the looked up value for queries.These -- to the extent that I understand them -- seem straightforward in TTM, either via CONSTRAINT mechanisms evaluating arbitrary code in a user-defined type for 2, 3, and 5 (?), or one lookup table per requirement for 1 and 4 (and 5?).

Your approach appears to hint at "god tables" in item 1. Are you afraid of creating tables/relvars?

Sometimes hiding database definitions behind abstractions is an effective way to deal with the apparent complexity of defining multitudes of tables and/or column/attribute types.

For "lookup table" read relation literal, as in the example I gave. I don't think it's a good idea to use the word "type" for something whose set of values can change from time to time.

Hugh

Quote from Dave Voorhis on October 30, 2018, 7:42 amQuote from dandl on October 30, 2018, 1:19 am

`'Up,Down,Neutral'`

, which is stored as a row in a table. The type specifies the table name and the key for the row. It's possible to write a constraint, and no translation is required in queries.- As 1, but using a regular expression.
`{ H: elevated, L: depressed, N: unchanged }`

. The type specifies the table name, key column and looked-up value column. It's possible to write a constraint, and a simple JOIN provides the looked up value for queries.Your approach appears to hint at "god tables" in item 1. Are you afraid of creating tables/relvars?

For "lookup table" read relation literal, as in the example I gave. I don't think it's a good idea to use the word "type" for something whose set of values can change from time to time.

Hugh

*Coauthor of The Third Manifesto and related books.*

Quote from Dave Voorhis on October 30, 2018, 4:24 pmQuote from Hugh on October 30, 2018, 4:16 pmFor "lookup table" read relation literal, as in the example I gave. I don't think it's a good idea to use the word "type" for something whose set of values can change from time to time.

Hugh

Yes. A type should be strictly invariant for the lifetime of a program/database.

A lookup table might never change in practice, but if it can

conceivablychange during the lifetime of a program/database, it confirms that it shouldn't be a type.

Quote from Hugh on October 30, 2018, 4:16 pm

Hugh

Yes. A type should be strictly invariant for the lifetime of a program/database.

A lookup table might never change in practice, but if it can *conceivably* change during the lifetime of a program/database, it confirms that it shouldn't be a type.

*I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org*

Quote from dandl on October 31, 2018, 4:23 amI used the word type intending some degree of generality, but in TTM terms what I'm referring to is more like a

type constraint(RM Pre 23a), or pick some other word if you prefer. What aboutdata value constraint?We still have the same issues. A

data value constraintwould be one that says "the value of attribute A must be one of those specified by the item with key K (or A) in relvar T. The values in relvar T could be changed (subject to database consistency rules) without changing any software. Nothing in the software can ever depend on which values are allowed.We cannot generalise

data value constraintsof this kind because we cannot refer to relvars or their attributes by reference. As per Hugh's comment, we can do something a bit similar, but only by using literals.

I used the word type intending some degree of generality, but in TTM terms what I'm referring to is more like a **type constraint** (RM Pre 23a), or pick some other word if you prefer. What about **data value constraint**?

We still have the same issues. A **data value constraint** would be one that says "the value of attribute A must be one of those specified by the item with key K (or A) in relvar T. The values in relvar T could be changed (subject to database consistency rules) without changing any software. Nothing in the software can ever depend on which values are allowed.

We cannot generalise **data value constraints** of this kind because we cannot refer to relvars or their attributes by reference. As per Hugh's comment, we can do something a bit similar, but only by using literals.