SQL is a standard?

#11 · February 22, 2019, 11:32 am

Quote from AntC on February 22, 2019, 7:54 am

Or have I got memory lapse? Was it common at the time SQL started out for equality tests to be (in)sensitive to trailing whitespace?

The COBOL I cut my programming teeth on in the 1960s ignored trailing blanks (the only white space we had at that time).

Hugh

Coauthor of The Third Manifesto and related books.

#12 · February 22, 2019, 12:03 pm

Quote from Dave Voorhis on February 22, 2019, 12:03 pm

Quote from AntC on February 22, 2019, 7:54 am

Was it common at the time SQL started out for equality tests to be (in)sensitive to trailing whitespace?

The first language I was paid to develop business applications in was PC-based dBase II, followed by dBase III. I've fortunately purged myself of memory of most of that language's thoroughly heinous abominations -- of which there were many -- so I dug about on the Web for information about equality, and ran across this bit of (slightly edited for brevity and relevance) horror:

When SET EXACT is ON, trailing blanks in either string are ignored in the comparison. When SET EXACT is OFF (the default), the = operator act like a "begins with" operator: the string on the left must begin with the string on the right.

It is recommended that you leave SET EXACT OFF so that you have the flexibility of doing an "exact" comparison or a "begins with" comparison as needed. By definition, all strings "begin with" an empty string, so when checking if a string is empty, always put the empty string on the left of the equality operator.

... If the string on the right of the = operator is (or begins with) CHR(0) and SET EXACT is OFF, then the comparison always returns true.

The $ operator determines if one string is contained in, or is a substring of, another string. By definition, an empty string is not contained in another string.

The above is from documentation for a modern (!) dBase-alike, so there may be some variation between what's described and how it was, but if memory serves, it was every bit as bad as it sounds. This was a modal language, whose semantics could vary depending on global switches like SET EXACT. Obviously foul beyond belief, but you have no idea how foul until you try to develop modular, reliable, reusable software with it.

I remember being (relatively) delighted at the time to discover that C provided a set of functions (using the term "function" loosely, as C does, to mean a named procedure that may or may not return a value) for string comparisons that did what they said on the tin, and if you didn't like their semantics you could create new functions that did what you wanted.

The equality operator in C (which is spelled '==') applied to strings -- which are pointers to arrays of characters -- only tests that they reference the same block of memory. That means expressions like "Dave" == "Dave" might return false or might return true (or what C uses to represent these) because the operands might or might not be allocated different patches of memory depending on the compiler's design and settings. But at least if you understood the language semantics it wasn't a surprise, and indeed typical modern C compilers will warn you that the result of the expression is unspecified.

The proper (case sensitive) C string comparison (at the time) was strcmp("Dave", "Dave").

In mainstream languages, it wasn't really until C++, Java, and C# that '=' reliably meant (or at least could mean, in the case of C++) equality without a long line of caveats, conditions and warnings.

Quote from AntC on February 22, 2019, 7:54 am

Was it common at the time SQL started out for equality tests to be (in)sensitive to trailing whitespace?

The first language I was paid to develop business applications in was PC-based dBase II, followed by dBase III. I've fortunately purged myself of memory of most of that language's thoroughly heinous abominations -- of which there were many -- so I dug about on the Web for information about equality, and ran across this bit of (slightly edited for brevity and relevance) horror:

When SET EXACT is ON, trailing blanks in either string are ignored in the comparison. When SET EXACT is OFF (the default), the = operator act like a "begins with" operator: the string on the left must begin with the string on the right.

It is recommended that you leave SET EXACT OFF so that you have the flexibility of doing an "exact" comparison or a "begins with" comparison as needed. By definition, all strings "begin with" an empty string, so when checking if a string is empty, always put the empty string on the left of the equality operator.

... If the string on the right of the = operator is (or begins with) CHR(0) and SET EXACT is OFF, then the comparison always returns true.

The $ operator determines if one string is contained in, or is a substring of, another string. By definition, an empty string is not contained in another string.

The above is from documentation for a modern (!) dBase-alike, so there may be some variation between what's described and how it was, but if memory serves, it was every bit as bad as it sounds. This was a modal language, whose semantics could vary depending on global switches like SET EXACT. Obviously foul beyond belief, but you have no idea how foul until you try to develop modular, reliable, reusable software with it.

I remember being (relatively) delighted at the time to discover that C provided a set of functions (using the term "function" loosely, as C does, to mean a named procedure that may or may not return a value) for string comparisons that did what they said on the tin, and if you didn't like their semantics you could create new functions that did what you wanted.

The equality operator in C (which is spelled '==') applied to strings -- which are pointers to arrays of characters -- only tests that they reference the same block of memory. That means expressions like "Dave" == "Dave" might return false or might return true (or what C uses to represent these) because the operands might or might not be allocated different patches of memory depending on the compiler's design and settings. But at least if you understood the language semantics it wasn't a surprise, and indeed typical modern C compilers will warn you that the result of the expression is unspecified.

The proper (case sensitive) C string comparison (at the time) was strcmp("Dave", "Dave").

In mainstream languages, it wasn't really until C++, Java, and C# that '=' reliably meant (or at least could mean, in the case of C++) equality without a long line of caveats, conditions and warnings.

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#13 · February 22, 2019, 12:08 pm

Quote from Dave Voorhis on February 22, 2019, 12:03 pm

In mainstream languages, it wasn't really until C++, Java, and C# that '=' reliably meant (or at least could mean, in the case of C++) equality without a long line of caveats, conditions and warnings.

Though I should add that this doesn't mean there aren't any caveats, conditions and warnings about '=' in these languages, only that the "warnings" line is a bit shorter. :-/

I'm the forum administrator and lead developer of Rel. Email me at dave@armchair.mb.ca with the Subject 'TTM Forum'. Download Rel from https://reldb.org

#14 · June 11, 2019, 6:29 pm

Quote from AntC on February 22, 2019, 7:54 am

I think we should say: SQL's = is not the same thing as any other language's = (or rather == in many cases). I don't think TTM's Prescription for equality is particularly unusual. (We need to go into a diversion about pointer equality vs value equality, I suppose; but that's really a discussion about when pointer equality returns False even though value equality returns True.)

Fortunately that matters only when you deal with mutable objects, not with immutable values. D doesn't have mutable objects, and AFAICT SQL doesn't either (except tables and databases, which are not first-class).

#15 · June 15, 2019, 5:19 am

Quote from johnwcowan on June 11, 2019, 6:29 pm

Quote from AntC on February 22, 2019, 7:54 am

I think we should say: SQL's = is not the same thing as any other language's = (or rather == in many cases). I don't think TTM's Prescription for equality is particularly unusual. (We need to go into a diversion about pointer equality vs value equality, I suppose; but that's really a discussion about when pointer equality returns False even though value equality returns True.)

Fortunately that matters only when you deal with mutable objects, not with immutable values. D doesn't have mutable objects, and AFAICT SQL doesn't either (except tables and databases, which are not first-class).

Muldis Data Language has reference values in addition to regular TTM values; for those, pointer equality IS value equality; two references are = iff they point to the same variable; two references pointing to 2 different variables are not equal even if both variables have appearances of the same value. I believe these semantics are the most faithful to TTM and are the most practical.

The Forum for Discussion about The Third Manifesto and Related Matters