WhitespaceTrimmingVisitor should trim more whitespace characters
Basics
Logistics
Description
The WhitespaceTrimmingVisitor is used to remove whitespace from the beginning and end of XML elements and attributes.
The current implementation uses the Java String#trim method, which is defined to remove characters whose character codes are 0x20 or less. This doesn't exactly match the definition of white space in either XML 1.0 (where whitespace is any of SPC, TAB, CR and LF) or XML 1.1 (which adds NEL (U+0085) and the Unicode line separator (U+2028)).
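To make the mismatch concrete, here is a minimal sketch comparing String#trim with a trimmer restricted to the four XML 1.0 whitespace characters. The class and the helper names (TrimVersusXmlWhitespace, isXml10Whitespace, trimXml10) are hypothetical and for illustration only; they are not part of the MDA code base.

```java
// Illustration only: none of these names exist in the MDA.
final class TrimVersusXmlWhitespace {

    /** True for the four XML 1.0 whitespace characters: SPC, TAB, CR, LF. */
    static boolean isXml10Whitespace(char c) {
        return c == ' ' || c == '\t' || c == '\r' || c == '\n';
    }

    /** Removes only XML 1.0 whitespace from both ends of the string. */
    static String trimXml10(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && isXml10Whitespace(s.charAt(start))) {
            start++;
        }
        while (end > start && isXml10Whitespace(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        // U+0001 is a control character, not XML whitespace, but trim removes it.
        String withControl = "\u0001value\u0001";
        System.out.println(withControl.trim().equals("value"));               // prints true
        System.out.println(trimXml10(withControl).equals(withControl));       // prints true

        // U+0085 (NEL) is above 0x20, so trim leaves it in place.
        String withNel = "\u0085value\u0085";
        System.out.println(withNel.trim().equals(withNel));                   // prints true
    }
}
```

So trim removes too much (every control character up to 0x20) and, for XML 1.1, too little (neither NEL nor U+2028).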
You might think that Java 11's String#strip would be the solution, but its definition is in terms of Character.isWhitespace, which includes a few more control characters than XML does. It might nevertheless be an improvement in reaching the original intent of this class.
Investigate and figure out which to use. A change to use String#strip would have to wait for a version of the MDA which required Java 11. The current intention is that this should be true for 1.0, but it may also be true for 0.10, depending on how that comes about.
As the baseline is now Java 17, String#strip is now available. It is an improvement: among other things it does include U+2028 in the characters it will strip. However, U+0085 is NOT included, although I don’t really understand why not (it’s possibly related to the fact that its “general category” is “control”).
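The behaviour described above can be checked with a short sketch. It also shows one possible way to close the remaining gap, a hypothetical stripWithNel helper that treats U+0085 as strippable in addition to whatever Character.isWhitespace accepts. The class and helper names are assumptions for illustration, not a proposal for the final implementation.

```java
// Illustration only: StripSketch and stripWithNel are hypothetical names.
final class StripSketch {

    /** Character.isWhitespace plus NEL (U+0085), which isWhitespace rejects. */
    static boolean isStrippable(char c) {
        return Character.isWhitespace(c) || c == 0x85;
    }

    /** Like String#strip, but also removes leading and trailing U+0085. */
    static String stripWithNel(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && isStrippable(s.charAt(start))) {
            start++;
        }
        while (end > start && isStrippable(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        String lineSep = "\u2028value\u2028";
        String nel = "\u0085value\u0085";

        System.out.println(lineSep.strip().equals("value"));   // prints true: U+2028 is stripped
        System.out.println(lineSep.trim().equals(lineSep));    // prints true: trim leaves U+2028
        System.out.println(nel.strip().equals(nel));           // prints true: U+0085 survives strip

        // U+0085 has general category Cc (control), U+2028 has Zl (line separator).
        System.out.println(Character.getType('\u0085') == Character.CONTROL);        // prints true
        System.out.println(Character.getType('\u2028') == Character.LINE_SEPARATOR); // prints true

        System.out.println(stripWithNel(nel).equals("value")); // prints true
    }
}
```

Whether an extra helper like this is worth carrying, compared with simply accepting String#strip, is part of what this issue needs to decide.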