Better value context indication from validators
Description
Environment
Activity

Ian YoungApril 11, 2024 at 10:30 AM
Closing this as good enough for now. We may want to come back and add similar functionality to other leaf validators, but I don’t yet know of other use cases where the same problem is present.

Ian YoungMarch 27, 2024 at 11:44 AM
I’ve tried that out in the UKf testbed and the results are fairly good, and probably good enough. I’ll be discussing that with Phil and Alex next week but for now I’m going to leave this open and move on to other work.

Ian YoungMarch 26, 2024 at 11:39 AM
Commit b8991e4d215f42b03289c5ba0d112f7cfb708dae is a first clean cut at this, after several lengthy false starts.
In this variation, there are two forms of the validate
method, one of which has a @Nullable String valueContext
. Both validate
methods now have default bodies referencing the other; this works as long as every implementing class implements at least one of the methods, or both. It means that implementing classes aren’t littered with almost-duplicate code handling both cases.
Unfortunately I couldn’t get an earlier attempt in which the parameter was @Nonnull
to gel because it was impossible to parse out the difference between not being handed a value and being called through the other method. I might revisit this if I have a further revelation, but we’re probably stuck with it and I think it’s acceptable. I suppose an @Nonnull Optional<String>
would be a related approach but it seems heavyweight in this case.
Most existing validator classes just provide the validate
method without the value context parameter; if they are called with the extra context, it will be discarded and things will be as they were before. Things like ValidatorSequence
implement the extended form and so the extra information can be passed down a stack of ever more granular validators while retaining the outermost context.
The important changes are in BaseAsValidator
and the various URLValidator
classes, which were the motivating use case. BaseAsValidator
--- if called without a valueContext
--- now interpolates the string form of the value being converted to be validated as something else as a value context for sub-validations. This is almost always a String anyway so the .toString
doesn’t do anything. The URL individual validators now rely on a somewhat enhanced collection of formatting helpers in BaseValidator
to prefix the resulting message with the context value if it’s present.
Note that I thought that I was going to have to tweak the two DOM string-validation stages (for element and attribute) so that they passed their value in this way as well. That turns out to be unnecessary, at least in the initial use case, because the same value is capturable by the As
validator they will always be asked to call. I don’t think anything would have to change if they did do that, but there are structural reasons why it’s tricky so I haven’t tested that belief.
That context formatting implementation might need some more work after we look at the actual results in a real deployment, so my next step will be to plug this all into the UKf testbed to see how things change and I might come back for another iteration. In particular, it may be worth giving BaseValidator
another property to describe how to format the context value, and potentially yet another to specify how to plug the context and main message together (there’s already a property controlling the main message’s formatting, but it was introduced in 0.10.0 so it is not fixed in stone just yet).

Ian YoungFebruary 28, 2024 at 6:00 PM
Making the valueContext
be a @Nonnull Object
would allow us to pass in anything that had a toString
, not just a String
.

Ian YoungFebruary 28, 2024 at 5:58 PM
One case that may motivate this is the new validators I’ve added for URL
s. The validator for an empty host component is being asked, essentially, to say whether a string is empty. If it is, it can say “the string was empty, the host name was missing” but it can’t easily say what the whole URL was because it’s not being handed that. The result is a rather thin error message which is much less helpful than we used to have.
That specific case we could fake up a copy of more-or-less the original value by applying toString
to the URL
but that might not be exactly what was in the original document so it’s potentially unhelpful in itself.
If there was a version of the validate
method with another @Nonnull
String
argument, that could be used to provide a better result in this case.
To get that to work, I’d have to add an implementation of that new method to that validator, the “as” validator that calls it and probably also the sequence construct that the iteration is built around so as to preserve it whenever it’s present but still function when it’s not. The alternative is to use a @Nullable
and lots of conditionals, but I think that would be less clean.
The default implementation in the interface would just call across to the degenerate case we already have.
The current
Validator
interface has the following (stripped down) method:Here we have a value
v
to process, anitem
where we can put item metadata such asErrorStatus
, and a stage identifier to use in constructing that item metadata'scomponentId
, normally by combining it with the validator's own identifier (so that we end up with something likestage/validator
).This works well to tell us which component in the configuration has caused the validation to fail, but it doesn't help in determining where in a perhaps large and complex item's content tree the value came from.
An improvement to this model would allow the validator's caller to optionally provide that information, perhaps by passing in a DOM element or attribute name which could then be integrated into the contents of the
ErrorStatus
.I'm spitballing identifiers here, but the idea is that one of the strings would indicate component structure as now (where at least some of the time
stageId
is a misnomer), and the other would be a hint as to where the value came from within the document.One of the two methods (probably the old one? this needs some thought) could be a default method to make it simpler to write validators that don't need this level of detail because the value itself (and the composite component identifier) is sufficient to identify the source.
I'm working on some prototype validation components at present that would benefit from this, but outside the MDA project. In that work, I will probably pass the value context in as part of the "{{stageId}}" value, but that's going to end up with it appearing in strange places. When I come to upstreaming that work, it should be rewritten along these lines.