V3DT conference call notes for Mon, Feb 8, 1999.

The HL7 version 3 data type task group held its fifteenth conference call on Monday, February 8, 1999, 11:00 to 12:30 EST.

Attendees were:

Woody Beeler
Mike Henderson
Stan Huff
Joann Larson
Mark Shafarman
Greg Thomas
Robin Zimmerman
Gunther Schadow.

Agenda items were:

The Orlando aftermath - new ideas, new directions ...

... you name it. This will be timeboxed to 30 minutes. We go in two rounds:
1. Brainstorming: people will name key words and short descriptions of their issues
2. Sweeping: we will sort things into bundles, and clarify, resolve, or assign for further work.

Real World Instances - what, whether, and, if so, how?

Agenda Item 1: The Aftermath ... new ideas, new directions, etc.

First Round: Brainstorming

Stan Huff

1. We need to be more clear on how such constructs as implicit type conversion or the No Information data type are actually implemented. This means both

how they are represented (e.g., in the XML ITS) and
how they are dealt with computationally (e.g., how a reasonably knowledgeable programmer can implement the type conversion system).

2. We need to specify how we express constraints (the syntax and semantics of constraint expressions). We will probably have special constraint expressions for primitive data types and generic expressions for the various composite or collection types.

3. We need more intuitive representations of many of the composite types, for example a form that expresses an Interval simply as "<3.4".

GS: This is going to be handled by character string literals. We want intuitive string literals to be available (independently of the ITS). String literals make most sense for the XML ITS, since one of the benefits of XML is the alleged "human readability" of the raw XML source. This readability would be enhanced by literals that are friendly to the human eye, rather than composite structures that just reflect the abstract syntax or semantic components.

In response to a clarifying question by Mark Shafarman: those literals would be defined for all ITSs alike, but an ITS may or may not choose to deploy a literal form for a fine-grained composite element.
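
As an illustration only (this was not worked out on the call), here is a minimal Python sketch of how a literal such as "<3.4" could be parsed into the semantic components of an Interval. The literal grammar, the Interval field names, and the function name parse_interval are all assumptions made for this sketch, not part of the data type proposal.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Interval:
    """Hypothetical semantic components of an interval of numbers."""
    low: Optional[float]       # None means unbounded below
    high: Optional[float]      # None means unbounded above
    low_closed: bool = False   # True if the low boundary is included
    high_closed: bool = False  # True if the high boundary is included


# Assumed literal forms: "<3.4", "<=3.4", ">3.4", ">=3.4", and "1.5-3.4".
_NUMBER = r"-?\d+(?:\.\d+)?"
_COMPARATOR = re.compile(rf"^(<=|>=|<|>)\s*({_NUMBER})$")
_RANGE = re.compile(rf"^({_NUMBER})\s*-\s*({_NUMBER})$")


def parse_interval(literal: str) -> Interval:
    """Turn a human-friendly literal into its semantic components."""
    text = literal.strip()
    match = _COMPARATOR.match(text)
    if match:
        operator, value = match.group(1), float(match.group(2))
        if operator in ("<", "<="):
            return Interval(low=None, high=value, high_closed=(operator == "<="))
        return Interval(low=value, high=None, low_closed=(operator == ">="))
    match = _RANGE.match(text)
    if match:
        # A plain range is assumed to include both boundaries.
        return Interval(float(match.group(1)), float(match.group(2)), True, True)
    raise ValueError(f"not a recognized interval literal: {literal!r}")
```

An ITS that chooses not to deploy the literal could still carry the same four components as separate elements; the literal is only a friendlier surface form.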

4. Annotation should include codes. With this, Stan points out that in the notes and the Orlando Report some conclusions do not actually reflect the consensus of the respective phone conference.

GS: I admit that in some places I deviated from what we settled on in the calls. Those areas are:

Annotations, where after reading Stan's list of examples for coded annotations, I was - frankly speaking - so disgusted that I just could not justify writing down a proposal to allow for this kind of chaos in version 3. (Note that Stan's list is a compilation of what he has found in reality; these are not his suggestions of how things should be done.)
I stepped back from the idea that we would merge the data types for physical quantities and monetary amounts. I made this decision on Friday night before the Orlando meeting. I saw very discouraging practical disadvantages in merging the code systems for units of measure and currency units.

We will of course have to come back to those issues to make sure the proposal reflects the consensus of the task force. I invite everyone to stand up and write down a consistent argument that would defeat my choice.

The question of the two data types for physical and monetary quantities was also addressed in my recent public e-mail discussion with Woody Beeler. What it boils down to in part is: when do we deploy a domain constraint and when do we define a new data type?

Stan Huff put it this way (and I very much agree): When a data type with a certain constraint is reused in many places, this suggests defining a top-level type name for the data type with this particular constraint. The advantages are consistency and maintenance. A type name would guarantee consistency in the way the constraint is expressed, since the same constraint is just reused, not rewritten. If a change needs to be made to the constraint, it can be made in just one place and take effect at all points of use.

GS: However, this maintenance issue has a straight flip side: if the same type was used in different places with different intentions and assumptions, and the modification renders any of those assumptions questionable or invalid, then it would have been better not to subsume the particular type-constraint configuration under one data type name. This is what I mean when I say that types ought to be clearly defined semantic entities, so that you can be sure that they are used everywhere with the same intentions and assumptions.

4. Stan believes that the data type models are exactly the way they should be from a semantic point of view. However, practical simplicity in the use of those types could be reached by flattening the models. For example, number - interval - measurements - ... - are now expressed through nested generic types, but those structures may be difficult to use.

Namely, the access paths of deeply nested types are quite long, which Stan regards as a problem. For instance, suppose we defined type T nesting types U and V. The accessor path would be t.u.v instead of simply t.v if the U level were flattened away.

However, Stan agrees that flattening is only possible if multiplicities are 1 to 1.
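
The T, U, V example can be sketched in Python as follows. This is only an illustration under the assumption that each level has exactly one nested instance; the field named value and the shortcut property are invented for the sketch.

```python
from dataclasses import dataclass


@dataclass
class V:
    value: float


@dataclass
class U:
    v: V  # exactly one V per U


@dataclass
class T:
    u: U  # exactly one U per T

    @property
    def v(self) -> V:
        # Shortcut that "flattens away" the U level: t.v instead of t.u.v.
        return self.u.v


t = T(u=U(v=V(value=3.4)))
nested_path = t.u.v.value     # the long accessor path
flattened_path = t.v.value    # the flattened accessor path
```

If U held a collection of V instead of exactly one, t.v would have no single answer, which is why the shortcut only works for 1 to 1 multiplicities.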

5. Do we understand the TII now? How do we support the use cases people associate with the "assigning authority" information? E.g., how can I call somebody to ask whether the specimen of my order was good? How can I contact a person in charge of some object? How can I contact a system that hosts a given object?

Woody Beeler

From the e-mail discussion, Woody would like to follow up on two central issues:

1. Representation of DTs as an HMD. Woody accepts my concern about abstract syntax being the normative part of the high level HL7 specification. But he does not consider HMDs to be just abstract syntax definition either. Thus an ITS could use a different abstract syntax than the one suggested by a naive interpretation of an HMD.

An ITS specification of a data type (or any other part of an HMD) could use fewer components than the HMD specification suggests.

However, conformance and interoperability would require this mapping between different abstract syntaxes to be semantically isomorphic. This means that all the semantic components that we identify in the data type specification (or the HMD) must be recoverable in any of the representations.

With this corrected understanding of the HMD, the type definition boxes used in our DT proposal are very similar to, if not the same as, HMD tables.

GS: Cool! This means I can safely withdraw my concerns about the HMD being too specific about abstract syntax.
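
To make the "semantically isomorphic" requirement concrete, here is a small Python sketch. It assumes a hypothetical quantity type with two semantic components and two invented representations, a nested one and a compact one with fewer syntactic components. The point is only that both can be mapped back to the same semantic components without loss.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Quantity:
    """Hypothetical type with two semantic components."""
    value: str  # kept as text so that the round trip is exact
    unit: str


def to_nested(q: Quantity) -> Dict[str, str]:
    # One syntactic component per semantic component.
    return {"value": q.value, "unit": q.unit}


def from_nested(d: Dict[str, str]) -> Quantity:
    return Quantity(d["value"], d["unit"])


def to_compact(q: Quantity) -> str:
    # A single syntactic component; assumes the value contains no space.
    return f"{q.value} {q.unit}"


def from_compact(s: str) -> Quantity:
    value, unit = s.split(" ", 1)
    return Quantity(value, unit)


original = Quantity("3.4", "mg")
```

Here from_nested(to_nested(q)) and from_compact(to_compact(q)) both give back q, so either representation could be used by an ITS without losing a semantic component.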

2. The generality of nested generic types and collections should somehow be constrained, so that we have a bounded set of data types that committees will use.

For this he would like to subtype the class of generic types into those generic types that apply to all base data types vs. those that apply only to some.

We will follow up on the channel definition example.

3. Woody now believes that the issue titled "once a list, always a list" would be decided in the affirmative and posted as a call for challenging examples. This means we would assume that the same RIM attribute could not switch between an elementary type and a collection thereof. Once a collection, always a collection.

Mark Shafarman

1. When domain TCs define composite data types, they should review their approach with Control Query. Control Query's task is to make sure the new types are consistent with what is already there and that they follow certain technical and style requirements. (GS: yes; however, we need to make the style requirements objective to keep ourselves (i.e. CQ) from making arbitrary decisions on what does and does not go.)

2. Constraints on data types can leverage the initial field review that occurred in the templates group. The available alternatives include a tabular form and language-like expressions of constraints.

Joann Larson

Joann and Mike remembered well the consternation they stirred up by pointing out that PAFM actually considers "mother's maiden name" to be a Stakeholder_identifier. Woody puts it this way: "is stakeholder id a generic attribute for stakeholder or an identifier?"

Second Round: Clustering the issues.

Data types and HMDs. This is one end of the cluster that includes the meta model, and much more. It is the glue between the data type specification and the MDF.
Constraints. Also an MDF-related topic. It has its own interest and is, though quite challenging, a well-defined and bounded problem. We feel the need for constraints emerging everywhere. Some field review work has been done in the template group, so we will leverage this for data types. Data types, especially literals, will play an important role in any kind of constraint definition. So this is our genuine task, which we will have to complete by April.

How many constraints can be specified in tables, and how many need a constraint expression language? While languages are nice, they have the disadvantage of requiring processors. Language surface forms can hide substantial complexity in the underlying semantics, both for better and for worse. The fewer different tools we need and the more we can handle with existing tools, the better for us. Each new language and each new tool will pose problems: it must be taught, it must be supported by software, and it must be understood and eventually implemented in HL7 interfaces.

As of now, we don't have a date for when we will attack constraints. This is a call for everyone to shop around and submit ideas on how we should move along.
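
To illustrate the two alternatives, here is a rough Python sketch of the same constraint written once as table rows and once as a language-like expression. The attribute names, bounds, and allowed units are invented for the example and are not a proposal for the actual constraint syntax.

```python
from typing import Any, Callable, Dict, List, Optional, Set, Tuple

# Tabular form: one row per constrained attribute.
# Columns: attribute name, minimum, maximum, allowed values.
Row = Tuple[str, Optional[float], Optional[float], Optional[Set[str]]]

CONSTRAINT_TABLE: List[Row] = [
    ("value", 0.0, 300.0, None),
    ("unit", None, None, {"mg", "g"}),
]


def satisfies_table(instance: Dict[str, Any], table: List[Row]) -> bool:
    """A generic checker: the only 'processor' the tabular form needs."""
    for attribute, minimum, maximum, allowed in table:
        actual = instance[attribute]
        if minimum is not None and actual < minimum:
            return False
        if maximum is not None and actual > maximum:
            return False
        if allowed is not None and actual not in allowed:
            return False
    return True


# Expression form: the same constraint as a predicate. A real constraint
# language would need its own parser and evaluator; Python stands in here.
constraint_expression: Callable[[Dict[str, Any]], bool] = (
    lambda x: 0.0 <= x["value"] <= 300.0 and x["unit"] in {"mg", "g"}
)

good = {"value": 250.0, "unit": "mg"}
bad = {"value": 500.0, "unit": "mg"}
```

The table covers simple range and value-set constraints with a very small checker. Anything that relates two attributes to each other would push toward the expression form.
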
Simplifications, flattening, constraining the combinations for generic types and collections.

As opposed to Stan, I do not believe that "long" accessor paths are a substantial problem. I think we should collect substantial evidence for whether and why the type structures as they exist now are a problem. I personally don't see a lot of urgency in flattening and simplifications. However, I am certainly willing to facilitate a discussion. The questions we need to get answered (in writing) are:
- Why are nested structures a problem?
- What is the benefit of nested structures?
- What alternatives exist?
- What makes the alternatives better?
- What are the downsides of the alternatives?
Mark Tucker already contributed to the discussion with his reference to the MIT Lisp Machine Lisp MIXINs (aka flavors mixed into the vanilla ice cream). I think that those are substantial alternatives worth considering. (I also think that this is what a serious proposal for alternatives should look like.)

While discussing the MIXIN idea, Mark and I found that it is both easier to realize than it seemed at first glance and has some interesting effects. We both concluded then that we would rather pursue the nested structures.

Contributions are welcome. However, we cannot schedule a conference call on this without prior written proposals. Our time is very short and there are lots of other things to take care of.
Detailed and complete ITS specification, at least for XML. This is a high-priority issue. Without this ITS spec we cannot implement or test anything. This would be either a separate smallish document (the XML ITS for data types), or - easier for now - specialized sub-sections on every data type. We need to specify both DTDs and examples. If the abstract syntax is not the same as that shown in the type definition boxes, isomorphic transformations must be specified that can unambiguously recover the semantic components.
Submissions are welcome. This can be done in smaller bites: just take one type and write down a piece of DTD and a couple of XML examples.
Detailed and complete specification of literals. Also a high-priority issue, if our XML ITS is going to use those literals. Should be a sub-section under each definition of a data type. If the type is not going to have a literal form, the section should appear anyway, explaining why this is so (so as to distinguish it from types for which the literal specification is simply not yet done).
More implementation guidance. Not very high priority. Should be added as needed. Everyone is encouraged to give comments and make change proposals to clear up those issues, and to provide further content that gives guidance to the users of this data type specification.
How the TII is used. This question may not so much pertain to the TII data type itself but rather to how it is used and what other types and attributes may be needed to support use cases that are kind-of freely associated with the concept of "assigning authority".

This is covered by an ongoing MDF discussion. Next conference call on this is Friday, Feb. 1999, 12 noon EST. Everyone is encouraged to participate. If from this discussion we find new requirements for our data types, we will take care of those in our group.
In the Orlando meeting we came across issues about the code phrase. (1) The code value seems to be redundant when many code values are collected in one code phrase and all code values happen to come from the same code system. (2) It almost seemed as if all code values in one code phrase should come from the same code system anyway? (3) Stan Huff made the important point that codes in a code phrase should have no "role" attached to them. This means, a code in a code phrase should have the same meaning or effect regardless of its position in the phrase. This suggests defining code phrase as a SET (rather than an ordered LIST).
We should come to terms on this rather quickly: it is a minor point, but stability of the concept descriptor constructs is needed to start implementing and using those types.
We will have to define maintenance and style rules for defining new types. These rules would define the role of CQ and would set a basis for CQ (and everyone else) to make objective (non-arbitrary) decisions about petitions for new data types or changes to old ones. This is almost MDF material.
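
Stan's point (3) can be illustrated with a short Python sketch. It assumes a code value is just a pair of code and code system; the class name, field names, and example codes are made up. Modeled as a set, two code phrases with the same members are equal regardless of position, whereas a list would make position significant.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class CodeValue:
    code: str
    code_system: str


# A code phrase modeled as an unordered SET of code values.
CodePhrase = FrozenSet[CodeValue]

first = CodeValue("A1", "EXAMPLE-SYSTEM")
second = CodeValue("B2", "EXAMPLE-SYSTEM")

phrase_a: CodePhrase = frozenset([first, second])
phrase_b: CodePhrase = frozenset([second, first])

same_as_sets = phrase_a == phrase_b                    # position plays no role
same_as_lists = [first, second] == [second, first]     # position matters
```

With the set, same_as_sets is true; with the lists, same_as_lists is false, which is exactly the positional "role" that Stan argues should not exist.
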

Agenda Item 2: Real World Instances - what, whether, and, if so, how?

My next agenda item would be what we are going to do about real world instances:

PN, XON, AD, PL, DLN, ...

My take on it is this:

do not define data types for: PN, XON, AD, PL, since those appear only once in the RIM and are already modeled (XON, PL) or are being modeled (AD). We should come up with a class definition for person name that is truly international, including all ramifications.

The only thing I am not sure about is DLN et al. I am almost inclined to recommend making Stakeholder_id a data type. SSN, DLN, Passport-Number, Inventory-Number, etc. are identifiers not only used for stakeholders. For example, the Animal proposal will need an animal identifier ... and Animal would not be a stakeholder.

The other way to go would be to generalize stakeholder id to

class RealWorldInstanceId {
  type_cd : CodeValue;
  id_txt : CharacterString;
  has_issuing_authority :: Stakeholder;
}

This would be the ultimate choice if we need that real association to real stakeholders. In my world, it is not allowed to associate with any RIM class instance from within a data type. In other words, there should be no hidden foreign keys to RIM classes within data types. This would prevent us from doing issuing_authority in the OID way. In real-world identifiers, the issuing authority is also a real-world authority, and I am very interested in which one it is.
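
For readers who prefer code to class notation, the class above could be sketched in Python roughly as follows. Plain strings stand in for CodeValue and CharacterString, the Stakeholder class is reduced to a name, and the example values are invented. The point of the sketch is that the issuing authority is a real association to a class instance, not a key hidden inside a data type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stakeholder:
    """Stand-in for the RIM Stakeholder class, reduced to a name."""
    name: str


@dataclass(frozen=True)
class RealWorldInstanceId:
    type_cd: str                        # stands in for CodeValue
    id_txt: str                         # stands in for CharacterString
    has_issuing_authority: Stakeholder  # explicit association to a class


# Hypothetical example: an identifier for an animal, which is not a stakeholder.
registry = Stakeholder(name="Example Kennel Registry")
animal_id = RealWorldInstanceId(
    type_cd="ANIMAL-TAG",
    id_txt="K-0042",
    has_issuing_authority=registry,
)
```

Because the authority is an object reference, one can navigate from the identifier to the authority itself, which is what the use cases around "assigning authority" ask for.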

However, we have a dilemma. We need to persuade PAFM to step back from their idea that "mother's maiden name" is a stakeholder identifier. Maybe not. But we have to rename that class and make it useful for other kinds of real world instances, like devices, things, animals.

The location stuff will be attacked by the Animal/PublicHealth/Epidemiology proposal. PL is already in the RIM as Patient_location. AD will be part of a generalized location class. So, I see no need for an AD data type. Sounds strange, I know, but it is the reasonable thing to do.

Woody Beeler reports that Mark Tucker and Jack Harrington hold that data types are classes where we are not worried about their identity. If we make classes from these former data types, we could create a set of CMETs for those data type-like classes, and they become functionally the same as data types.

Woody also suggests that if we make PN and AD classes we should put them into the CQ domain instead of PAFM. PAFM is already overloaded with a lot of stuff, and CQ could concentrate better on the internationalization issues.

Mark Shafarman says that we want to control the number of different forms of data types. We want to handle internationalization right. Whatever we do on those types should remain in the realm of CQ.

Woody suggests (but is unsure about this) that we could use OIDs for stakeholders that only act as identifier issuing authorities. Every user/organization will have an OID name space anyway, so the real world instance identifier would look similar to a TII, just that the "extension" field would be named identifier_text (or the like) and be required rather than optional. A couple of well known OIDs would be used for assigning authorities that everyone knows of (e.g. Social Sec. Authority, or the like). HL7 could allocate OIDs for those in its own name space.
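
Woody's OID alternative could look roughly like the Python sketch below. The class name, field names, and validation rules are assumptions, and the OID shown is a made-up placeholder, not an OID actually allocated by HL7 or anyone else.

```python
import re
from dataclasses import dataclass

# Dotted-decimal OID syntax: at least two arcs of digits separated by dots.
_OID_PATTERN = re.compile(r"\d+(\.\d+)+")


@dataclass(frozen=True)
class OidScopedInstanceId:
    issuing_authority_oid: str  # names the issuing authority, like a TII root
    identifier_text: str        # required here, unlike the TII "extension"

    def __post_init__(self) -> None:
        if not _OID_PATTERN.fullmatch(self.issuing_authority_oid):
            raise ValueError("issuing_authority_oid is not a dotted-decimal OID")
        if not self.identifier_text:
            raise ValueError("identifier_text is required")


# Placeholder only: NOT a real OID for any social security authority.
HYPOTHETICAL_SSN_AUTHORITY_OID = "1.2.3.4.5"

ssn = OidScopedInstanceId(HYPOTHETICAL_SSN_AUTHORITY_OID, "000-00-0000")
```

Compared with the class that associates to a Stakeholder, this form stays a pure data type, at the price of only naming the authority rather than linking to it.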

Joann and Mike will find a specification of postal address forms from the "international postal union" (IPU). ... They did their homework promptly and reported this very informative publication. However, it seems like the model addresses cheat sometimes, since they use trivial post box addresses where you would really like to see what a street address looks like. It leaves a lot to further research or guesswork. Strawman proposal on external id numbers --- PAFM? CFP to ... next:

Issues to tackle in the next calls are:

Person name
Postal address
Real World Instance identifier

Next conference call is next Monday, Feb 15, 1999, 11:00 AM EST.

Agenda items are:

External Identifier

regards

-Gunther Schadow