V3DT conference call notes for Mon, Dec 7, 1998.

These notes also serve as preparation for the next conference call, so please read them and think about the open issues.

The HL7 version 3 data type task group held its eighth conference call on Monday, December 7, 1998, from 11:00 AM to 12:30 PM EDT.

Attendees were:

Agenda items were:

  1. comments on last conference and notes.
  2. generic data type for intervals (aka. "ranges") [slide]
  3. generic data type for uncertainty (aka. "probability distributions") [slide]

AGENDA ITEM 1: Comments on last conference and notes.

Comments on the last conference notes came from Mark Shafarman, who suggested that:
  1. The denominator component of the Ratio must not be zero. This is a technical correction, and the notes of the last conference have been updated accordingly.
  2. The disposition of the different categories of "units" (i.e. physical units, monetary units, dosage forms) may not be closed. People tend to write expressions of the form
    number x unit
    The units do have certain common properties (e.g. you cannot add meters and seconds, or apples and oranges). But there are also important differences. Stan Huff said that in the case of counts reported as number x things counted, the measurement name should contain the "things counted" rather than the units (e.g. in lab databases one frequently finds units such as "red blood cells" vs. "white blood cells", which are redundant if the measurement name is reported properly). We have to revisit this topic later.

GENERIC DATA TYPES FOR QUANTITIES

AGENDA ITEM 2: Generic data type for intervals (aka. "ranges")

GENERIC DATA TYPE "INTERVAL"

Interval
Generic data type that can express a range of values. Ranges of values are most abundant as ranges of absolute time, used for ordering and scheduling. Note that an interval is not to be used to specify confidence intervals for uncertain values.
GENERIC TYPE

parameter name | allowed types | description
---------------+---------------+------------
T              | OrderedType   | Any ordered type can be the basis of an interval. It does not matter whether the base type is discrete or continuous or whether any algebraic operators are defined for that type.

component name | type/domain | optionality | description
---------------+-------------+-------------+------------
low            | T           | optional    | The lower boundary.
low open       | Boolean     | required    | Indicates whether the interval is closed or open at the lower boundary. For a boundary to be closed, a finite boundary must be provided, i.e. unspecified or infinite boundaries are always open.
high           | T           | optional    | The upper boundary.
high open      | Boolean     | required    | Indicates whether the interval is closed or open at the upper boundary. For a boundary to be closed, a finite boundary must be provided, i.e. unspecified or infinite boundaries are always open.
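
To make the component table concrete, the following is a minimal sketch of the interval type in Python, not a normative definition. The class name `Interval`, the field names `low_open` and `high_open`, the use of `None` for an unspecified boundary, and the `contains` helper are assumptions made for illustration only. The sketch enforces only the rule that an unspecified boundary is always open; infinite boundary values are not checked.

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar
from datetime import time

T = TypeVar("T")  # stands in for the OrderedType parameter of the generic type


@dataclass(frozen=True)
class Interval(Generic[T]):
    low: Optional[T]    # optional: None models an unspecified lower boundary
    low_open: bool      # required: True if the interval is open at the lower boundary
    high: Optional[T]   # optional: None models an unspecified upper boundary
    high_open: bool     # required: True if the interval is open at the upper boundary

    def __post_init__(self):
        # Rule from the table: a closed boundary needs a provided boundary value.
        if self.low is None and not self.low_open:
            raise ValueError("an unspecified lower boundary must be open")
        if self.high is None and not self.high_open:
            raise ValueError("an unspecified upper boundary must be open")

    def contains(self, value: T) -> bool:
        # Hypothetical helper: only ordering comparisons are used, so any
        # ordered base type works (numbers, times, strings, ...).
        if self.low is not None:
            if value < self.low or (self.low_open and value == self.low):
                return False
        if self.high is not None:
            if value > self.high or (self.high_open and value == self.high):
                return False
        return True


# A time range closed at both ends, from 3:15 to 4:00.
order = Interval(low=time(3, 15), low_open=False, high=time(4, 0), high_open=False)
in_range = order.contains(time(3, 30))  # True
```

Because `contains` relies on nothing but `<`, `>` and `==`, the sketch reflects the statement that the base type need not be continuous or support algebraic operators.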

NOTES

There was considerable discussion on whether or not the interval notation is appropriate for things like a set A

  A = { x_i | x_i < n }

i.e. the set of numbers x_i with all x_i less than n.

Stan Huff wants to specify a range as a composite of

The arguments for Stan's alternative were:
  1. people "commonly expect to see" values such as Glucose: <6 mg/dl as a lab result,
  2. a result for diastolic blood pressure such as <60 mm Hg would be more appropriate than 40-60 mm Hg, or that 40-60 mm Hg would be a less appropriate expression that would have to be forbidden anyway,
  3. one can "calculate" with entities consisting of a number and a relational operator better than with intervals,
  4. the operator-number pair construct would be "more intuitive",
  5. in the interval notation people would erroneously use only the low boundary component even though they want to express exact values like 5 or a range bounded only on the right side.
The arguments for the above stated interval construct were:
  1. it is a general form to specify any continuous set of quantities,
  2. it readily contains the alternative suggested by Stan Huff,
  3. it is a uniform representation that is easier to compute with, because the upper boundary is always found in the same place.
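
The second argument, that the interval construct contains the operator-number alternative, can be illustrated with a small sketch. It assumes that the alternative is a pair of a relational operator and a number, as the arguments above suggest. The operator set (`<`, `<=`, `>`, `>=`, `=`), the function name `from_comparison`, and the tuple layout are hypothetical choices made for this illustration, not something the group agreed on.

```python
from typing import NamedTuple, Optional


class Interval(NamedTuple):
    low: Optional[float]    # None means the lower boundary is unspecified
    low_open: bool
    high: Optional[float]   # None means the upper boundary is unspecified
    high_open: bool


def from_comparison(operator: str, number: float) -> Interval:
    """Map a hypothetical operator-number pair onto the interval components."""
    if operator == "<":
        return Interval(None, True, number, True)
    if operator == "<=":
        return Interval(None, True, number, False)
    if operator == ">":
        return Interval(number, True, None, True)
    if operator == ">=":
        return Interval(number, False, None, True)
    if operator == "=":
        # An exact value is the interval closed at both ends on that value.
        return Interval(number, False, number, False)
    raise ValueError(f"unsupported operator: {operator!r}")


# "<6 mg/dl" has no lower boundary and an open upper boundary at 6.
glucose = from_comparison("<", 6)
```

The reverse mapping does not always exist: an interval with two distinct finite boundaries, such as 40-60 mm Hg, has no single operator-number equivalent.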

In this discussion, I felt that we (myself included) mixed together two orthogonal conceptual axes and a couple of independent issues, which was not very helpful.

First: As always, we must distinguish between the surface-level representation of a type and the semantics of the type. As always, this caveat was disregarded.

Second: There are three different uses of ranges on the table that may be intuitively described as

  1. a set of values, where each value may apply under some circumstances (e.g. an order scheduled to begin at 3:15 and end at 4 o'clock);
  2. one single value supposed to assume just one value from the range of values given (e.g. a measurement which turns out to be off the lower absolute limit and therefore can be reported only as a range with an upper boundary);
  3. one single value whose set of possible values is partitioned into equivalence classes because the exact differences are not interesting or not measurable (e.g. in microbiologic susceptibility testing, we may have a parameter "OXACILLIN SUSC" where only the following equivalence classes are of interest: > 8.0 µg/ml (not susceptible); 4.0±2 µg/ml (limited susceptibility); and < 2.0 µg/ml (susceptible)).
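
The third use can be sketched as a function that maps a measured value to its equivalence class. This is an illustration only, not clinical guidance. The function name `oxacillin_class` is hypothetical, and the treatment of the class edges is an assumption: 4.0±2 µg/ml is read here as the closed range 2.0 to 6.0, and values that fall in none of the three stated classes (above 6.0 and up to 8.0) yield `None`, because the example above does not say how they should be classified.

```python
from typing import Optional


def oxacillin_class(mic_ug_per_ml: float) -> Optional[str]:
    """Return the equivalence class for a value of the "OXACILLIN SUSC" parameter."""
    if mic_ug_per_ml < 2.0:
        return "susceptible"
    # Assumption: 4.0±2 is taken as closed at both ends, i.e. 2.0 <= value <= 6.0.
    if 2.0 <= mic_ug_per_ml <= 6.0:
        return "limited susceptibility"
    if mic_ug_per_ml > 8.0:
        return "not susceptible"
    # The three stated classes leave this part of the range uncovered.
    return None


labels = [oxacillin_class(v) for v in (1.0, 4.0, 16.0)]
```

Two different measured values in the same class, such as 3.0 and 5.0, are indistinguishable after this mapping, which is the sense in which the exact differences are treated as not interesting.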

Miscellaneous issues are: