When you have a "special" value (UNK, NA, NULL)... can or should that have a <unit> tag

@rbeyer @michaelaye I didn’t want to create a PVL issue on GitHub so I am trying here. I have a user with an interesting ODL question. Based on PVL, or general ODL best practices, what might you recommend? I think I agree with his assessment (case 1).

-Trent

So I have a question regarding ODL, and PDS3. When you have a “special” value (UNK, NA, NULL)… can or should that have a <unit> tag?

We have a situation where a value was coming out: “UNK” <mm>

Having the mm outside the quotes was the proximate problem, but the underlying problem is that units really make no sense with a special value. But sometimes units are required. So… which wins in those cases?

Think of it as an ODL question rather than PDS3, since we are still making ODL labels even for 2020 (which is where this came up).

So the options are:

  1. remove the <unit> tag for a special value
  2. put the <unit> inside the quotes for these cases

Only #1 actually makes sense to me but I wanted to check with the experts.

So I’m going to assume that their use of “tag” means a units expression enclosed by angle brackets like “<km>”.

From section 12.7.3 (ODL/PVL Usage) of the PDS3 Standards Reference, item 11 says:

  11. Unit expressions are only allowed following numeric values (i.e., “DATA_ELEMENT = 7 <BYTES>” is valid, but “DATA_ELEMENT = MANY <METERS>” is not).

So anything that isn’t a number cannot have a units expression. The “special” values that they are talking about (UNK, NA, NULL) may be special to them, but not to PVL or ODL. If they are surrounded by quotes they are “quoted text”, otherwise they would just be “identifier” elements, neither of which can have a “units expression.”
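To make the rule concrete, here is a minimal sketch of a label writer that only emits a units expression after a numeric value and otherwise writes plain quoted text (option #1). The helper name `format_assignment` and the keyword `FOCAL_LENGTH` are made up for illustration; this is not the pvl library's API.

```python
from numbers import Real


def format_assignment(name, value, units=None):
    """Render one ODL-style 'NAME = VALUE' line (hypothetical helper).

    A units expression is appended only when the value is numeric; for
    anything else the units are dropped and the value is written as
    quoted text.
    """
    # bool is a subclass of int in Python, so exclude it explicitly.
    is_number = isinstance(value, Real) and not isinstance(value, bool)
    if is_number:
        suffix = f" <{units}>" if units else ""
        return f"{name} = {value}{suffix}"
    return f'{name} = "{value}"'


print(format_assignment("DATA_ELEMENT", 7, "BYTES"))   # DATA_ELEMENT = 7 <BYTES>
print(format_assignment("FOCAL_LENGTH", "UNK", "mm"))  # FOCAL_LENGTH = "UNK"
```

Silently dropping the units is only one possible policy; raising an error or a warning for a non-numeric value with units would be just as reasonable.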

FYI, using the new alpha version of the Python pvl library with its ODL or PDS3 encoder would raise a warning if you tried to encode something that wasn’t a number with a units expression, but that may not be helpful if they aren’t using it.

So actually either moving the units expression inside the quoted string, or removing it altogether when the value is a quoted string or an identifier, would make it valid ODL.

I’m not sure what the basis is for the statement “sometimes units are required.” Maybe something else is requiring that; PVL/ODL certainly isn’t.

However, if this was a problem I had, I’d go with #1.

Thanks for the quick reply. Good to see that a rule even describes this situation.

I don’t have any vested interest in the outcome, but so long as we’re citing PDS3 standards, I wanted to point out that Chapter 17 contains recommendations on how to handle “UNK”, “N/A”, and “NULL” when the field normally contains a numeric value.

The chapter doesn’t deal with units explicitly, but it does recommend using certain numbers to represent “UNK” or “N/A” in fields that are normally of a numeric type. See table 17.1 in linked PDF.
This would imply to me that units could be present, even though the numeric indicator of “UNK” or “N/A” wouldn’t be physically meaningful.

This is most likely the reason why most teams go with no-value numbers like -9999: it lets them keep the units in while still following all the standards.
And I’d rather have a fast parser and filter/mask the NaN values later (a sequence that is fully supported by the GDAL/rasterio nan-value field, as you folks know) than a slow parser that has to deal with suddenly missing units.
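As a rough sketch of that parse-first, mask-later sequence: every line below has the same `NAME = NUMBER <UNITS>` shape, so the parser never has to special-case a missing units expression, and the sentinel is turned into NaN in a separate step. The sentinel `-9999`, the keyword `FOCAL_LENGTH`, and the helper names are assumptions for illustration, and the regular expression covers only simple decimal numbers, not the full ODL grammar.

```python
import math
import re

NO_DATA = -9999.0  # assumed sentinel; use whatever your labels document

# NAME = NUMBER <UNITS>, with the units expression optional.
LINE_RE = re.compile(
    r"^\s*(\w+)\s*=\s*([-+]?\d+(?:\.\d+)?)\s*(?:<([^>]+)>)?\s*$"
)


def parse_line(line):
    """Return (name, value, units) for one numeric assignment line."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"not a numeric assignment: {line!r}")
    name, number, units = match.groups()
    return name, float(number), units


def mask_no_data(value, no_data=NO_DATA):
    """Replace the sentinel with NaN so later math ignores it explicitly."""
    return math.nan if value == no_data else value


lines = ["FOCAL_LENGTH = 34.0 <mm>", "FOCAL_LENGTH = -9999 <mm>"]
parsed = [parse_line(line) for line in lines]
masked = [(name, mask_no_data(value), units) for name, value, units in parsed]
```

After this, `parsed` still holds the raw sentinel with its units, while `masked` carries NaN in its place.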