Since I'm following a TDD approach, the implementation of the state validation logic
In a TDD approach, I would expect you to be concentrating your attention on the implementation of behavior, rather than the implementation of state.
Which is to say: your example of having a test for get_allowed_substates that explores the entire range of inputs is "fine" (though I have doubts about the structure of this test: the API you have been "driven" to, and how well the test serves as documentation for how the method is expected to work).
But as a function, what is this for? If the state machine is under the control of your program, then reaching an "invalid" state really means that there has been a programmer error somewhere - that some implementation of a transition in the state machine didn't realize the intended state.
On the other hand, if you are dealing with a source of "untrusted" state data, then you probably ought to have a parser somewhere close to the boundary where the untrusted data enters the system, and you should be test-driving the parser.
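For illustration only, here's a rough sketch of what test-driving such a boundary parser might look like -- Python here, and every name (parse_state, VALID_STATES, InvalidStateError) is an invented placeholder, not anything from the question:

    import pytest

    VALID_STATES = {"draft", "published", "archived"}  # invented domain

    class InvalidStateError(ValueError):
        pass

    def parse_state(raw):
        # The parser sits at the boundary: it turns untrusted text into
        # a value the rest of the program can trust without re-checking.
        if raw in VALID_STATES:
            return raw
        raise InvalidStateError(raw)

    def test_parser_accepts_known_states():
        assert parse_state("draft") == "draft"

    def test_parser_rejects_untrusted_garbage():
        with pytest.raises(InvalidStateError):
            parse_state("not-a-state")

The tests describe the contract at the boundary, which is usually more interesting than the table of values behind it.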
(Now, in fairness, there is a subset of the TDD community that loves starting with small "units" that are driven into existence in isolation -- you might be doing that, but it isn't obvious from this example that you are, or that you are keeping a specific client in mind; one of the reasons that I prefer working "outside in" is that you get alignment with the client context "for free".)
This made me realize there is significant overlap between the validation logic within the domain objects and unit tests
Yeah, this kind of thing is common and normal - it has to do with the fact that we have the same predicate being used for two different ideas.
In our parser, we're trying to figure out if some general-purpose data structure is aligned with our narrower range of expected values (example: we're receiving information using a general-purpose integer data type, but our narrower expectation is that the integer shall always be positive -- thus, roughly half of the space of values that fit in the general-purpose data type are "invalid", and need not-on-the-happy-path handling).
In our domain types, we're using a constructor to ensure that some data constraint only needs to be checked once, and the validation within the constructor is a defensive programming technique intended to catch errors in our parsing code.
(Warning: untested pseudo-code written in an imaginary programming language)
    class TrustedData:
        def __init__(untrustedData):
            FailFast.programmerError(...) if not valid(untrustedData)
            ...

    def parsing(untrustedData):
        if valid(untrustedData):
            happyPath( TrustedData(untrustedData) )
        else:
            # handle the usual case that we've gotten bad information
            # from our untrusted source
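(If the imaginary syntax is distracting, here's one way the same shape might be rendered in real Python -- still a sketch, where valid, happy_path, and handle_bad_input are stand-ins I've made up:)

    def valid(data):
        # Stand-in predicate; in practice this encodes the narrower
        # expectation (e.g. "the integer is positive").
        return isinstance(data, int) and data > 0

    def happy_path(trusted):
        print("processing", trusted.value)  # stand-in

    def handle_bad_input(data):
        print("rejecting", data)  # stand-in

    class TrustedData:
        def __init__(self, untrusted_data):
            # Programmer-error check: by the time we get here, the
            # parser should already have established validity.
            if not valid(untrusted_data):
                raise AssertionError("parser let invalid data through")
            self.value = untrusted_data

    def parsing(untrusted_data):
        if valid(untrusted_data):
            happy_path(TrustedData(untrusted_data))
        else:
            handle_bad_input(untrusted_data)  # the usual case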
What we're really doing in the TrustedData constructor is checking that the programmer who wrote the parsing function did the right thing.
If you look through the older literature, the code in the constructor might look instead like
    class TrustedData:
        def __init__(untrustedData):
            assert valid(untrustedData)
Where the assert line would be completely removed from the production code (e.g., in C code, those lines would normally be elided by the preprocessor before the source was passed to the compiler).
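Python has a close analogue, for what it's worth: plain assert statements are skipped entirely when the interpreter runs with python -O, so (reusing the invented valid predicate from above) the production build pays nothing for the check:

    class TrustedData:
        def __init__(self, untrusted_data):
            # Not executed under `python -O`, much as defining NDEBUG
            # compiles C's assert() away.
            assert valid(untrusted_data)
            self.value = untrusted_data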
is writing tests for the domain model redundant, and thus to be avoided?
Typically, in DDD, the domain model is the thing that's real; it's where the important logic lives, and that's the code that changes regularly as the needs of the business evolve -- so that's an important area to be able to make changes without introducing faults. TDD's twin goals of (a) improving the design so that faults are introduced less frequently and (b) producing tests so that faults are detected both quickly and cost effectively play well here.
But specifically for the case of something like a programmerError detector, the cost-benefit ratio of having "tests" is not quite the same as it would be in code that has a lot of branches.
Horses for Courses
is it a sensible choice to test all possible combinations of superstate and substate
From a TDD perspective? Mostly no, in my experience -- you don't typically learn anything interesting about the design by spamming an infinite number of input combinations.
From a testing perspective? Maybe. See "Property Based Testing" in the literature. My experience is that typing-the-same-thing-in-two-places tests aren't especially valuable, and that the cost-benefit ratio gets really bad if the answer isn't stable.
At one end of the spectrum you have "the answers are really stable, but we are constantly changing how we calculate them", and at the other end you have "the answers are always changing, but the calculation is always so simple that there are obviously no deficiencies" -- for the latter case, you may have more cost effective alternatives than maintaining a suite of "automated" tests.
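For concreteness, a property-based version of the superstate/substate check might look roughly like this (Python with the Hypothesis library; the states, the table, and both functions are invented stand-ins shaped after the question, not the asker's actual API):

    from hypothesis import given, strategies as st

    ALLOWED = {
        "open": {"new", "triaged"},
        "closed": {"resolved", "wont_fix"},
    }

    def get_allowed_substates(superstate):
        return ALLOWED[superstate]

    def is_valid_substate(superstate, substate):
        return substate in ALLOWED[superstate]

    @given(
        st.sampled_from(sorted(ALLOWED)),
        st.sampled_from(["new", "triaged", "resolved", "wont_fix"]),
    )
    def test_predicates_never_disagree(superstate, substate):
        # One property instead of a hand-typed table of every combination:
        # the two predicates must never contradict each other.
        allowed = substate in get_allowed_substates(superstate)
        assert allowed == is_valid_substate(superstate, substate)

Note that the caveat above still applies: when both sides are computed from the same table, a test like this mostly checks that the table agrees with itself.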