DRY — "ThirdNormalForm is the analogous principle for data." (?) – Page 2

#11 · October 20, 2025, 7:06 pm

"A CeeLanguage .h file contains type specifications for functions, typedefs, etc., but no executable code."

I don't think the CeeSpec defines any such rule. I could be wrong, but I've always seen #include as merely a text macro facility, so if the included file contains executable code, then executable code is what gets included.

Certainly the PL/1 INCLUDE facility is like that.
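A minimal sketch of that point, assuming a hypothetical header called square.h. Its contents are written inline here, which is what the compiler would see after #include had pasted them in; the file name and the function are made up for illustration.

```c
/* --- contents of a hypothetical "square.h", as #include would paste them --- */
#ifndef SQUARE_H
#define SQUARE_H

/* Executable code in a header: nothing in the language forbids it.
   static inline keeps each including file's copy private to that file,
   so including the header from several .c files does not clash at link time. */
static inline int square(int x)
{
    return x * x;
}

#endif /* SQUARE_H */
/* --- end of "square.h" --- */

/* The including .c file can now call the function like any other. */
static int sum_of_squares(int a, int b)
{
    return square(a) + square(b);
}
```

Calling sum_of_squares(3, 4) runs code that came entirely from the "header" text. Keeping headers to declarations only is a convention, not something the compiler enforces.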

Author of SIRA_PRISE

#12 · October 20, 2025, 7:18 pm

"Using the POOD, how to account for a contradiction? between the stand-alone type signature -- which is the sort of fact that could be stated in a RelVar, vs an inferred type -- which is not 'stated' anywhere but is implied by a function body via rules of type inference."

I'd say R1 is a relvar [holding a relation] that tells us the "stated" signature, and R2 is a relvar (perhaps a virtual one ??? - though that probably gets us out of the POOD's intended scope of applicability) [holding a relation] that tells us the "inferred" signature, and equality must hold between the two. I don't know how or why "attribute renamings" would fit in, and ditto for "subsets of attributes", and ditto for restriction conditions c1 and c2.

Author of SIRA_PRISE

#13 · October 21, 2025, 12:17 am

Is this a variation on cache invalidation? Any beginner can save derived or duplicate data and mess things up. The pros will treat this sort of thing as a cache, and then have subtle bugs in maintaining that cache. IIRC, the idea that an RDBMS maintains 'physical denormalisation' for performance, is the idea that this is done along declarative lines with a well tested engine. I think that's the generalisation - whether you have a robust system for maintaining your duplicated or cached expensive calculation data, or it's all using a bespoke ad hoc method specific to an individual code base.

#14 · October 21, 2025, 8:55 am

Quote from Erwin on October 20, 2025, 7:06 pm

...

I don't think the CeeSpec defines any such rule. ... #include as merely a text macro facility, ...

Weeell to answer the question first.

#include is not part of CeeLanguage. It's a directive for cpp, the so-called C PreProcessor. Yes, that's just a macro facility.
So cpp knows nothing of the syntax/usage of C (.c, .h files) nor any other language, and you could (and typically do) use cpp to mash together all sorts of sources, languages, config/command text.
I'm not expecting a tool dating back to 1973 to embody DRY thinking nor any other principled approach to software design.
Yes you could #include .c text into a .c file. The internet (StackOverflow, Quora, W3 Schools, various geeks tutorials) has plenty of opinions about the advisability of doing that -- suggest you go look it up for yourself [strong language warning].

I was describing (my understanding of) best practice (or at least standard practice). Yes, multiple #includes are one form of footgun in C, amongst many. I wasn't wishing to get into a critique of C nor engage in language wars. I was raising what looked like a violation of DRY. So thank you for failing to answer the question.

#15 · October 21, 2025, 9:15 am

Quote from Jake Wheat on October 21, 2025, 12:17 am

Is this a variation on ...

Hi Jake, DRY (as I understand it) is focussing on the artefacts declared/defined in the program text, config, files, as authored by the system designers.

A data warehouse will use those definitions to (probably) map to a denormalised/repeated form. That doesn't count as anti-DRY, because the form is an automated consequence of declarations/schema declared once and once only.

... cache invalidation?

It's okay to have mechanical, textual duplication (the equivalent of caching values: a repeatable, automatic derivation of one source file from some meta-level description), as long as the authoritative source is well known. [from the link I mentioned]

... then have subtle bugs in maintaining that cache.

If maintaining the cache needs human action, that sounds like it's not an automatic derivation, so it offends against DRY.
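One way to picture "mechanical duplication from a single authoritative source" in C is the X-macro idiom. This is only a sketch under my own assumptions: the colour list and all the names are hypothetical, and the idiom is one common technique, not something the linked page prescribes.

```c
/* The single authoritative source (hypothetical example data). */
#define COLOR_LIST(X) X(RED) X(GREEN) X(BLUE)

/* Derived artefact 1: the enum, generated from the list. */
#define AS_ENUM(name) COLOR_##name,
enum color { COLOR_LIST(AS_ENUM) COLOR_COUNT };
#undef AS_ENUM

/* Derived artefact 2: the name table, generated from the same list,
   so it cannot drift out of step with the enum. */
#define AS_STRING(name) #name,
static const char *const color_names[] = { COLOR_LIST(AS_STRING) };
#undef AS_STRING

/* Look up the printable name; out-of-range values map to "?". */
static const char *color_name(enum color c)
{
    return ((unsigned)c < COLOR_COUNT) ? color_names[c] : "?";
}
```

Adding a colour means editing COLOR_LIST only; the enum and the name table are regenerated on every compile with no human action. A hand-maintained enum plus a hand-maintained string array would be the "cache needing human action" case.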

IIRC, the idea that an RDBMS maintains 'physical denormalisation' for performance, is the idea that this is done along declarative lines with a well tested engine. I think that's the generalisation - whether you have a robust system for maintaining your duplicated or cached expensive calculation data, or it's all using a bespoke ad hoc method specific to an individual code base.

Ideally the performance-oriented repeated content in an RDBMS or data warehouse refreshes automatically from the transactional data capture. I suspect that relying on an application to write to a log file, or even on database triggers, is frowned on.

But of course if your transactional system is locked into 1990s db architecture because upgrading is too much of a risk to the business (looking at you, Banking and Airlines), there'll have to be compromises.

#16 · October 22, 2025, 5:42 am

Quote from AntC on October 21, 2025, 8:55 am

#include is not part of CeeLanguage. It's a directive for cpp, the so-called C PreProcessor. Yes, that's just a macro facility.

The pre-processor directives including #include are part of the C language as defined by the relevant language standard. The means by which this is achieved are not. A conforming implementation must process these directives, but need not do so by means of a separate pre-processor.

Andl - A New Database Language - andl.org

#17 · October 22, 2025, 1:24 pm

I may have missed the point if the focus of the discussion is only on programming language headers and configuration.

I'd say there were three categories

1. you're repeating something, but this is down to bad design somewhere - e.g. in a distributed system, manually repeating config on each node that should be part of a quorum component

2. you're repeating something manually, but this is because you are defining a contract between two components of some kind. You really want this to be checked - and for it to be clear when that happens and what happens if the check fails

3. you're repeating something manually, and if there are inconsistencies, all sorts of weird stuff happens because there isn't checking. This is also a design failure
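For what categories 2 and 3 can look like in C, here is a rough sketch. It assumes C11, and the file names area.h and area.c and the function are hypothetical; both "files" are shown inline in one translation unit.

```c
/* Contents of a hypothetical "area.h": the stated contract. */
int area(int width, int height);

/* Contents of a hypothetical "area.c". Because the declaration above is
   visible here, the compiler rejects this definition if the two disagree. */
int area(int width, int height)
{
    return width * height;
}

/* C11: make the check explicit. Compilation fails if the type of area
   is not the stated signature. */
_Static_assert(_Generic(&area, int (*)(int, int): 1, default: 0),
               "area() does not match its stated signature");
```

The repetition is checked only because the defining file sees the declaration (category 2). If area.c did not include area.h and the two disagreed, the program could still compile and link, with no diagnostic required and undefined behaviour at run time (category 3).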

Then there's how C headers and link time name availability work (not sure what the proper technical term is). "Not even wrong" is the appropriate label here IMO.

Not sure I completely follow it, but I think your variant example boils down to deciding to what degree you want static typing, or unityping (aka "dynamic typing")? I think this is a question of what kind of contract you want between two components.

(My comment about maintaining a cache was about contrasting a robust system like declarative storage optimisation in a rdbms, and project specific cache maintenance code.)

TTM Forum

The Forum for Discussion about The Third Manifesto and Related Matters