The HL7 version 3 data type task group held its twenty-first conference call on Monday, March 15, 1999, from 4:00 PM to 5:30 PM EST.
Attendees were:
Agenda items were:
This was the final call on the PAFM-related issues. We collected the "homework" and walked through the data types once again to make sure we had resolved the open issues. This has been a complex subject, and we may realize later that we have to come back to it. For the time being, however, we will stick to the specification summarized here. Today's notes will go into the data type report for Toronto.
The HL7 v2 person name data types (PN, XPN) have basically the same problems as the data type for addresses. That is, they define slots for data, so that whatever name parts exist must be fitted into one of the available slots. This has the same disadvantages. First, name part types cannot be classified in a simple and interchangeable way across all cultures, yet everyone must use the same classification. Second, the meaning of a name part and its position are orthogonal (independent) aspects of a name. As an additional problem, person names may occur in different orderings, and some name parts are used or omitted depending on the use case (e.g., formal vs. familiar style).
The decisions made here were informed by the following references:
Bidgood DW Jr, Tracy WR. In search of the name. Proc Annu Symp Comput Appl Med Care, 1993; p. 54-58.
Bidgood DW Jr, Tracy WR. ANSI HISPP MSDS: COMMON DATA TYPES for harmonization of communication standards in medical informatics. Final Draft. 10/30/1993. Available as Postscript or Word.
Hopkins R. Strategic short study: names and numbers as identifiers. CEN TC251. Available as PDF or Word. Note especially Appendix B: National Name Forms by Arthur Waugh, Australia.
We first present the proposed data structure for person names; we then show examples, discuss ramifications, and justify why this particular design was chosen.
Earlier discussions included a person name class and a person name variant, but we found a requirement to model person name as a RIM class. What we did not realize is that, as with the stakeholder id, the RIM class already exists; it only needs to be polished.
The RIM class Person_name will be developed from the class Person_alternate_name of RIM 0.88 jointly with PAFM. A person may have multiple instances of the person name class, reflecting the multiple names the person is or was known by.
Within this RIM class, there is a code that indicates what purpose a given name is to be used for. Most people in the world will have one name that is currently used.
SYMBOL | SHORT | DESCRIPTION |
---|---|---|
normal | N | The name normally used. May be restricted through validity time intervals. |
license | L | A name not normally used, but registered on some record, license, or other certificate of professional or academic credential (includes birth certificates, school records, degrees and titles, and licenses). |
artist | A | An artist's pseudonym; includes "stage name" and writer's name. |
indigenous | I | Indigenous or tribal names, such as exist among Native Americans and Australians. |
religious | R | Name adopted through practice of religion. For example, "Father Irenaeus," "Brother John," or "Sister Clementine" are religious names that persons adopted through entering an order or assuming a religious office or both. |
There is also a way to specify the validity time of a name.
This class also contains a representation of a single name variant as a list of person name parts that may or may not have semantic tags.
Those RIM changes will have to be discussed jointly with CQ and PAFM at the Toronto meeting in April 1999. We will seek definite closure on the issue in Toronto after which Harmonization will be but a formal issue, since all relevant parties will have agreed to one proposal.
Person Name Part | |||
---|---|---|---|
This type is used in the RIM class Person_name that will be developed from the class Person_alternate_name of RIM 0.88 jointly with PAFM. Person names contain a token list. Tokens usually are character strings but may have a tag that signifies the role of the token. Typical name parts that exist in almost every name are given names and family names; other part types may be defined culturally. | |||
component name | type/domain | optionality | description |
value | Character String | mandatory | The value of a name part. |
classifiers | SET OF Code Value | optional | Classifications of a name part. One name part can fall into multiple categories, such as given name vs. family name and name of public record vs. nickname. |
SYMBOL | SHORT | DESCRIPTION |
---|---|---|
Axis 1 This is the main classifier. Only one value is allowed. | ||
given | G | Given name (don't call it "first name", since given names do not always come first) |
family | F | Family name, this is the name that links to the genealogy. In some cultures (e.g. Eritrea) the family name of a son is the first name of his father. |
prefix | P | A prefix has a strong association to the immediately following name part. A prefix has no implicit trailing white space (it has implicit leading white space though). Note that prefixes can be inverted. |
suffix | S | A suffix has a strong association to the immediately preceding name part. A suffix has no implicit leading white space (it has implicit trailing white space though). Suffixes cannot be inverted. |
delimiter | D | A delimiter has no meaning other than being literally printed in this name representation. A delimiter has no implicit leading and trailing white space. |
Axis 2 Name change classifiers describe how a name part came about. More than one value allowed. | ||
birth | B | A name that a person had shortly after being born. Usually for family names but may be used to mark given names at birth that may have changed later. |
unmarried | U | A name that a person (either sex) had immediately before her/his first marriage. Usually called "maiden name", this concept of maiden name is only for compatibility with cultures that keep up this traditional concept. In most cases maiden name is equal to birth name. If there are adoption or deed polls before first marriage the maiden name should specify the last family name a person acquired before giving it up again through marriage. |
chosen | H | A name that a person assumed because of free choice. Most systems may not track this, but some might. Subsumed in the concept of "chosen" are pseudonym (alias) and deed poll. The difference in civil dignity of the name part is given through the R classifier below. I.e., a deed poll creates a chosen name of record, whereas a pseudonym creates a name not noted in civil records. |
adoption | C | A name that a person took on because of being adopted. Adoptions may happen for adults too and may happen after marriage. The effect on the "maiden" name is not fully defined and may, as always, simply depend on the discretion of the person or a data entry clerk. |
spouse | M | The name assumed from the partner in a marital relationship (hence the "M"). Usually the spouse's family name. Note that no inference about gender can be made from the existence of spouse names. |
Axis 3 Additional classifiers. More than one value allowed. | ||
nick | N | Indicates that the name part is a nickname. Not explicitly used for prefixes and suffixes, since those inherit this flag from their associated significant name parts. Note that most nicknames are given names although it is not required. |
callme | C | A callme name is a name (usually a given name) that is preferred when a person is directly addressed. |
record | R | This flag indicates that the name part is known in some official record. Usually the antonym of nickname. Note that the name purpose code "license" applies to all name parts of a name, whereas this code applies only to one name part. |
initial | I | Indicates that a name part is just an initial. Initials do not imply a trailing period since this would not work with non-Latin scripts. Initials may consist of more than one letter, e.g., "Ph." could stand for "Philippe" or "Th." for "Thomas". |
invisible | 0 (zero) | Indicates that a name part is not normally shown. For instance, traditional maiden names are not normally shown. Middle names may be invisible too. |
weak | W | Used only for prefixes and suffixes (affixes). A weak affix has a weaker association to its main name part than a genuine (strong) affix. Weak prefixes are not normally inverted. When a weak affix and a strong affix occur together, the strong affix is closer to its associated main name part than the weak affix. |
Axis 4 Additional classifiers for affixes. Usually only one value allowed per affix. Classification does not try to be complete. | ||
voorvoegsel | VV | A Dutch "voorvoegsel" is something like "van" or "de" that might have indicated nobility in the past but no longer does. Similar prefixes exist in other languages such as Spanish, French, or Portuguese. |
academic | AT | Indicates that a prefix like "Dr." or a suffix like "MD" or "PhD" is an academic title. |
professional | PT | Primarily in the British Imperial culture people tend to have an abbreviation of their professional organization as part of their credential suffixes. |
noblety | NT | In Europe there are still people with nobility titles. German "von" is generally a nobility title, not a mere voorvoegsel. Others are "Earl of" or "His Majesty King of ..." etc. Rarely used nowadays, but some systems do keep track of this. |
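To make the structure in the tables above concrete, here is a minimal sketch in Python. It is an illustration only and not part of the proposal: the class name `PersonNamePart` mirrors the type name used in these notes, while `AXIS1`, `main_classifier`, and the choice to raise `ValueError` are assumptions of the sketch. It enforces only the one constraint the table states outright, namely that at most one Axis 1 classifier is allowed per name part.

```python
from dataclasses import dataclass

# Axis 1 symbols copied from the classifier table; only one may be present.
AXIS1 = frozenset({"given", "family", "prefix", "suffix", "delimiter"})


@dataclass(frozen=True)
class PersonNamePart:
    value: str                             # mandatory character string
    classifiers: frozenset = frozenset()   # optional SET OF Code Value

    def __post_init__(self):
        # Accept any iterable of symbols and store it as an immutable set.
        object.__setattr__(self, "classifiers", frozenset(self.classifiers))
        if len(self.classifiers & AXIS1) > 1:
            raise ValueError(f"more than one Axis 1 classifier on {self.value!r}")

    @property
    def main_classifier(self):
        """Return the single Axis 1 classifier, or None if the part is unclassified."""
        return next(iter(self.classifiers & AXIS1), None)


# A name variant is simply an ordered list of name parts.
irma = [
    PersonNamePart("Irma", {"given", "record"}),
    PersonNamePart("Jongeneel", {"family", "record", "spouse"}),
]
```

Classifiers from the other axes are stored but not validated here, since the notes allow more than one value on Axis 2 and Axis 3.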
Names contain white space. The white space rules used in typesetting are not trivial. In general, two name parts are separated by white space. A punctuation mark, such as a comma or period, directly follows the preceding non-whitespace text, but such marks are always followed by white space. Dashes are not surrounded by white space at all. Note that these white space rules do not really exist for languages such as Thai or Japanese, where white space is basically not used. However, white space can always simply be ignored, which is why Thai and Japanese are easier to print. In any case, neither Thai nor Japanese would have white space where it is not allowed in Latin script.
The difficult whitespace rules can, for the purpose of the person name data type, be broken down into the following precise rules:
White space never accumulates, i.e. two subsequent spaces are the same as one.
Literals may contain explicit white space subject to the same white space reduction rules.
Except for prefix, suffix and delimiter name parts, every name part is surrounded by implicit white space. Leading and trailing explicit whitespace is insignificant in all those name parts.
Delimiter name parts are not surrounded by any implicit white space. Leading and trailing explicit whitespace is significant in delimiter name parts.
Prefix name parts only have implicit leading white space but no implicit trailing white space. Trailing explicit whitespace is significant in prefix name parts.
Suffix name parts only have implicit trailing white space but no implicit leading white space. Leading explicit whitespace is significant in suffix name parts.
This means that all name parts are generally surrounded by white space, but white space never accumulates. Delimiters are never surrounded by implicit white space, prefixes are not followed by implicit white space, and suffixes are not preceded by implicit white space. Any white space contributed by preceding or succeeding name parts around those special name parts is discarded, whether it was implicit or explicit.
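The white space rules above can be sketched as a small rendering function. This is one possible reading of the rules, not a normative algorithm: the helper names `part` and `render` are hypothetical, the handling of the `inverted` flag anticipates the printing rule given with the inverted-name examples further below, and stripping white space at the very beginning and end of the whole name is an assumption of the sketch.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonNamePart:
    value: str
    classifiers: frozenset = frozenset()


def part(value, *classifiers):
    """Hypothetical helper: build a name part from a value and classifier symbols."""
    return PersonNamePart(value, frozenset(classifiers))


def render(parts):
    """Print a list of name parts according to the white space rules."""
    out = ""
    prev_right_implicit = False
    for p in parts:
        # White space never accumulates, not even inside a literal.
        text = re.sub(r"\s+", " ", p.value)
        # Trailing literal white space is removed from inverted prefixes.
        if {"prefix", "inverted"} <= p.classifiers:
            text = text.rstrip()
        # Delimiters and suffixes have no implicit leading white space;
        # delimiters and prefixes have no implicit trailing white space.
        left_implicit = not ({"delimiter", "suffix"} & p.classifiers)
        right_implicit = not ({"delimiter", "prefix"} & p.classifiers)
        # Explicit white space is insignificant on a side that is implicit.
        if left_implicit:
            text = text.lstrip()
        if right_implicit:
            text = text.rstrip()
        # An implicit space survives only if both neighbours contribute one.
        if out and left_implicit and prev_right_implicit:
            out += " "
        out += text
        prev_right_implicit = right_implicit
    # Assumption: outer white space of the whole name is not significant.
    return re.sub(r"\s+", " ", out).strip()


record_name = [
    part("Irma", "given", "record"),
    part("Corine", "given", "record"),
    part("Jongeneel", "family", "record", "spouse"),
    part("-", "delimiter"),
    part("de Haas", "family", "record", "birth"),
]
print(render(record_name))  # Irma Corine Jongeneel-de Haas
```

The design choice here is that a side without implicit white space "wins": whatever the neighbouring part would have contributed on that side is dropped, and only the explicit white space of the delimiter, prefix, or suffix itself is printed.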
Irma Jongeneel, of HL7 the Netherlands, has many nice ramifications in her name, so we will dwell a little on it. Irma has two given names, "Irma" and "Corine". In her childhood her family name was "de Haas". Then Irma married Gerard Jongeneel. In Holland both spouses can choose to use either or both of their family names in arbitrary order. For the public records Irma chose the combination "Irma Corine Jongeneel-de Haas". But we know her by the name "Irma Jongeneel", i.e., for casual use she assumed the family name of her spouse. But if Irma had to appear in a court of law and her name were cited, she would be called "Irma Corine de Haas e.g. Jongeneel", where "e.g." stands for "echtgenote van", meaning "spouse of".
Let's write down the variants that we know now in the familiar instance notation.
First the name by which we know her
Irma Jongeneel

    (LIST
      (PersonNamePart :value "Irma" :classifiers (SET given record))
      (PersonNamePart :value "Jongeneel" :classifiers (SET family record spouse)))

Just as with the address we have to take care about spacing. When the name is to be printed we usually have the name parts separated by white space. But there are notable exceptions which we will encounter in the following example.
The following is the name of her marriage record (?)
Irma Corine Jongeneel-de Haas

    (LIST
      (PersonNamePart :value "Irma" :classifiers (SET given record))
      (PersonNamePart :value "Corine" :classifiers (SET given record))
      (PersonNamePart :value "Jongeneel" :classifiers (SET family record spouse))
      (PersonNamePart :value "-" :classifiers (SET delimiter))
      (PersonNamePart :value "de Haas" :classifiers (SET family record birth)))

Note that the dash "-" is printed without leading and trailing white space. This is signified by the flag delimiter in the name classifier set. We know this flag already from the Address data type. Since names never have line breaks, the line break feature does not exist with delimiters in person names.
There is a problem with the "de", which is classified as a voorvoegsel in Dutch. Another very common voorvoegsel is "van" as in "van Soest". This Dutch "van" is not actually a nobility prefix, although it sounds like it used to be one. Such prefixes exist in many languages, including French, German, and Portuguese.
The problem with such prefixes is that they belong to exactly one other name part, e.g., "Haas". In Dutch the part "Haas" of "de Haas" is called the significant part of that family name, since it is significant for alphabetic sorting. Since "de" cannot occur without "Haas" and "Haas" will not occur without "de", the two are linked more strongly than "de Haas" and "Jongeneel".
One way to handle this associativity is through nesting. With parentheses we could write "(Irma (de Haas) Jongeneel)" to show that "de" and "Haas" are associated more strongly than the other parts. However, nesting is costly, as it leads to significant additional complexity in the data type definition. Not that nesting is a bad idea per se. However, since the nesting depth appears to be limited to three levels, the generality of nesting does not seem to outweigh the simplicity of a simple linear list.
There are other ramifications though, such as prefixes that consist of more than one part, as in French "Eduard de l'Aigle". Here "de l'" is one prefix that consists of two parts and that connects to the significant part without spacing. To make things more complex, we have to realize that "de l'Aigle" is in fact a contraction of "de-la-Aigle". But we decided not to deal with this kind of lexical variation. It is probably safe to consider "de l'" as one prefix that binds strongly to the following significant name part.
Thus we could go without nesting by using special name part flags "prefix". Prefix means that this name part binds strongly to the following name part and we consider it to bind without space. Let's try how that feels:
de Haas

    (LIST
      (PersonNamePart :value "de " :classifiers (SET prefix))
      (PersonNamePart :value "Haas" :classifiers (SET family)))

Note that "de " contains a literal space. Alternatively we could define flags for prefix-with-space and prefix-no-space, but this would just make things more complex. As a rule we say that name part prefixes associate without space to the following name. If a space is required, it must be included in the name part. See the white space rules above.
Eduard de l'Aigle has a prefix that includes no space
Eduard de l'Aigle

    (LIST
      (PersonNamePart :value "Eduard" :classifiers (SET given))
      (PersonNamePart :value "de l'" :classifiers (SET prefix))
      (PersonNamePart :value "Aigle" :classifiers (SET family record)))

This method is challenged when we want to capture an inverted name form such as "Haas, de, Irma", as used in a phone book or in bibliographies. Here we lose the strong association between the prefix "de" and its significant name "Haas". The prefix is placed after the significant name "Haas", there is even an intervening comma, and, to make things even worse, the spacing of "de" is different ("de" vs. "de "). It's a matter of finding the most elegant solution. You can always argue about elegance of course.

Haas, de, Irma

    (LIST
      (PersonNamePart :value "Haas" :classifiers (SET family))
      (PersonNamePart :value ", " :classifiers (SET delimiter))
      (PersonNamePart :value "de " :classifiers (SET prefix inverted))
      (PersonNamePart :value ", " :classifiers (SET delimiter))
      (PersonNamePart :value "Irma" :classifiers (SET given)))

Here we say that the prefix "de " (with trailing space!) is inverted. The computer now knows that the prefix is associated with some preceding part. The rule is: An inverted prefix associates to the nearest preceding name part that is not a delimiter. Furthermore, the rule for printing the name is: Trailing literal white space is to be removed from inverted prefixes.
For Eduard de l'Aigle this works likewise:
Aigle, de l', Eduard

    (LIST
      (PersonNamePart :value "Aigle" :classifiers (SET family))
      (PersonNamePart :value ", " :classifiers (SET delimiter))
      (PersonNamePart :value "de l'" :classifiers (SET prefix inverted))
      (PersonNamePart :value ", " :classifiers (SET delimiter))
      (PersonNamePart :value "Eduard" :classifiers (SET given)))
To completely cover all ramifications we can further undo the contraction "de l'A..." to "de la":
Aigle, de la, Eduard

    (LIST
      (PersonNamePart :value "Aigle" :classifiers (SET family))
      (PersonNamePart :value ", " :classifiers (SET delimiter))
      (PersonNamePart :value "de la" :classifiers (SET prefix inverted))
      (PersonNamePart :value ", " :classifiers (SET delimiter))
      (PersonNamePart :value "Eduard" :classifiers (SET given)))

However, this decomposition and contraction of "de la <vowel>" to "de l'<vowel>" and vice versa is outside the scope of HL7. This is rarely taken proper care of even in phone books or bibliographic databases, so that hardly any HL7 application will need to care.
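The association rule for prefixes can be written down in a few lines. The following Python sketch is an illustration under assumptions: the function name `prefix_target` and the helper `part` are hypothetical, and for a non-inverted prefix the sketch simply returns the immediately following part, without looking through chains of several prefixes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonNamePart:
    value: str
    classifiers: frozenset = frozenset()


def part(value, *classifiers):
    """Hypothetical helper: build a name part from a value and classifier symbols."""
    return PersonNamePart(value, frozenset(classifiers))


def prefix_target(parts, i):
    """Return the index of the name part that the prefix at index i belongs to."""
    cls = parts[i].classifiers
    if "prefix" not in cls:
        raise ValueError("name part is not a prefix")
    if "inverted" not in cls:
        # A normal prefix associates to the immediately following name part.
        return i + 1 if i + 1 < len(parts) else None
    # An inverted prefix associates to the nearest preceding name part
    # that is not a delimiter.
    for j in range(i - 1, -1, -1):
        if "delimiter" not in parts[j].classifiers:
            return j
    return None


natural = [part("de ", "prefix"), part("Haas", "family")]
inverted = [
    part("Haas", "family"),
    part(", ", "delimiter"),
    part("de ", "prefix", "inverted"),
    part(", ", "delimiter"),
    part("Irma", "given"),
]
print(natural[prefix_target(natural, 0)].value)    # Haas
print(inverted[prefix_target(inverted, 2)].value)  # Haas
```

In both forms the prefix resolves to "Haas", which is what a receiver needs in order to sort by the significant name part or to convert between the natural and the inverted form.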
As we said earlier, when Irma shows up in a court of law, she might be called
Irma Corine de Haas e.g. Jongeneel

    (LIST
      (PersonNamePart :value "Irma" :classifiers (SET given record))
      (PersonNamePart :value "Corine" :classifiers (SET given record))
      (PersonNamePart :value "de " :classifiers (SET prefix))
      (PersonNamePart :value "Haas" :classifiers (SET family record birth))
      (PersonNamePart :value "e.g." :classifiers (SET prefix weak))
      (PersonNamePart :value "Jongeneel" :classifiers (SET family record spouse)))

The "e.g." behaves pretty much like a prefix. It is not "significant"; it associates with the following name part. The difference is that the association is weak. A weak association of a prefix or suffix means that the affix might be dropped. It is still a prefix, which means that it moves wherever the following name part moves, but a weak prefix could be omitted.
Note that a weak prefix may be followed by a (strong) prefix, such as in "Gerard Jongeneel e.g. de Haas". Note also that if a weak prefix is followed by a name part which in turn is followed by an inverted (strong) prefix, the inversion would be undone by insertion of the (strong) prefix between the weak prefix and the significant name part. Contemplate "Jongeneel, Gerard e.g. Haas, de" as an example.
In "Claudine de l'Aigle née Dubois" and "Dorothea Schadow geb. Riemer", "née" and "geb." formally behave just like "echtgenote van", i.e., they are weak prefixes. However, note that the semantics are reversed. "Echtgenote van" means "spouse of", while "née" and "geborene" mean "born" in French and German respectively.

Claudine de l'Aigle née Dubois

    (LIST
      (PersonNamePart :value "Claudine" :classifiers (SET given record))
      (PersonNamePart :value "de l'" :classifiers (SET prefix))
      (PersonNamePart :value "Aigle" :classifiers (SET family record spouse))
      (PersonNamePart :value "née" :classifiers (SET prefix weak))
      (PersonNamePart :value "Dubois" :classifiers (SET family record birth)))

The semantic difference between née and e.g. is not important, since the classification of name parts into birth vs. spouse is unambiguous.
Let's play a little with nicknames. I know Bob Dolin as "Bob", but at HL7 he is enrolled as "Robert Dolin" and on papers he calls himself "Robert H. Dolin". This is no big deal, since we have three distinct name forms that we decided to treat as separate Person names without trying to relate those name parts across the variants.
The following is the first example of a complete Person Name structure.
Bob Dolin, Robert Dolin, or Robert H. Dolin

    (SET
      (Person_name :value
        (LIST
          (PersonNamePart :value "Bob" :classifiers (SET given nick))
          (PersonNamePart :value "Dolin" :classifiers (SET family))))
      (Person_name :value
        (LIST
          (PersonNamePart :value "Robert" :classifiers (SET given))
          (PersonNamePart :value "Dolin" :classifiers (SET family))))
      (Person_name :value
        (LIST
          (PersonNamePart :value "Robert" :classifiers (SET given))
          (PersonNamePart :value "H." :classifiers (SET given initial))
          (PersonNamePart :value "Dolin" :classifiers (SET family)))))

We did not classify the person name variants here, since this would open up another can of worms. It almost seems like there is a gradual scale of formality which tells which of the various person names to use.
Degrees of formality may be relevant, but are not yet handled in the HL7 data type. Other examples are: sloppy (Kiki), familiar (Kathy), nick (Kathrin), of record (Katharina), highly official (Ekatharina). We need input from Japan on that. Note also the "Bob Dolin" example above.
Let's take Woody Beeler. Woody is known as "George (Woody) W. Beeler" in the HL7 membership database. This parenthesis is an interesting construct that we might want to capture a bit more semantically and a bit less literally. The way Woody would pronounce this example is probably: "My name is George W. Beeler, but call me Woody." The parentheses are just a style to print the name badge. Actually the HL7 name badge looks like:
We do not allow line breaks in person names. Instead of literal parentheses or line breaks, we suggest a semantic markup using the callme name part classifier.
Woody George W. Beeler
George (Woody) W. Beeler

    (LIST
      (PersonNamePart :value "George" :classifiers (SET given))
      (PersonNamePart :value "Woody" :classifiers (SET callme))
      (PersonNamePart :value "W." :classifiers (SET given initial))
      (PersonNamePart :value "Beeler" :classifiers (SET family)))

Two different applications could now use the same name variant to produce a name badge for an HL7 meeting and to print the HL7 membership directory. The rule for the badge application is: if there are "callme" name parts, print those in big and fat, and print all the other names below, except those names that are classified only as "callme". For the electronic membership directory the rule would be: print all names in order and put callme-only name parts in parentheses.
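The two printing rules can be sketched as follows. This is a minimal Python illustration, not a prescribed implementation: the function names `badge_lines` and `directory_entry` are hypothetical, and joining parts with a single space is a simplification that ignores the special white space rules for prefixes, suffixes, and delimiters.

```python
from dataclasses import dataclass

CALLME_ONLY = frozenset({"callme"})


@dataclass(frozen=True)
class PersonNamePart:
    value: str
    classifiers: frozenset = frozenset()


def part(value, *classifiers):
    """Hypothetical helper: build a name part from a value and classifier symbols."""
    return PersonNamePart(value, frozenset(classifiers))


def badge_lines(parts):
    """Badge rule: callme parts on top, all non-callme-only parts below."""
    callme = [p.value for p in parts if "callme" in p.classifiers]
    others = [p.value for p in parts if p.classifiers != CALLME_ONLY]
    if not callme:
        return [" ".join(others)]
    return [" ".join(callme), " ".join(others)]


def directory_entry(parts):
    """Directory rule: all parts in order, callme-only parts in parentheses."""
    return " ".join(
        f"({p.value})" if p.classifiers == CALLME_ONLY else p.value
        for p in parts
    )


woody = [
    part("George", "given"),
    part("Woody", "callme"),
    part("W.", "given", "initial"),
    part("Beeler", "family"),
]
print(badge_lines(woody))      # ['Woody', 'George W. Beeler']
print(directory_entry(woody))  # George (Woody) W. Beeler
```

Both outputs are derived from the same list of name parts, which is the point of the semantic markup: the presentation is decided by the receiving application.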
Let's take an example where we just can't classify the names. Consider "Iketani Sahoko". Of course, if you know some Japanese you will know that Sahoko is a Japanese female and "Iketani" is her family name. But let's assume you don't know that :-). All you have is an unconscious girl who has the name "Iketani Sahoko" printed (in Latin letters) somewhere on her purse.

Iketani Sahoko

    (LIST
      (PersonNamePart :value "Iketani")
      (PersonNamePart :value "Sahoko"))

You now send this name without any classifier. The point is that you cannot tell which one is the given name and which one is the family name. If you guess from the order (given name = first name) you are wrong. So, if in doubt, why be forced to guess? Of course, most databases will force you to guess. But this wild guess can be made by the receiving HL7 interface just as well as by an unknowledgeable human. Later, when you learn more about your patient, you can enter the correct classification:

Iketani Sahoko

    (LIST
      (PersonNamePart :value "Iketani" :classifiers (SET family))
      (PersonNamePart :value "Sahoko" :classifiers (SET given)))
The XPN data type of HL7 version 2.3.x may serve as a validation to see what other name types or name part types may be needed. Of course, there is also the issue of compatibility between version 2 and version 3 of HL7.
code | meaning |
---|---|
A | alias |
L | legal |
D | display |
M | maiden name |
C | adopted (name acquired through the person being adopted) |
B | name at birth |
P | name of spouse (name taken from) |
U | unspecified |
One problem that we have in mapping those name types is that our new person name type is structurally different from the old one. It is not possible, therefore, to simply reuse those codes without further thought.
The first issue is that the old person name had a bunch of fixed slots and a name type code affecting the interpretation of data found in all slots. Our new type has name parts which are individually classified, and it has a purpose code for name variants which affects all name parts of the name variant. The semantics of the name parts, i.e., what those parts are, is described entirely in the name part classifiers. Each name variant has a certain use case, purpose, or context.
Let's go once again through the table of v2.3.x name type codes trying to determine whether those codes stand for an inherent meaning of a name (part) or its purpose. I'll also make other annotations that might be helpful in sorting things out.
code | meaning | comments |
---|---|---|
A | alias | purpose: a person uses different aliases or pseudonyms in different contexts (e.g., when referring to himself as the author of a book, an actor, your friend, a customer in a bank, or a patient in a hospital). |
L | legal | purpose: this is the name of public record (if any). Such records do not exist in all countries. In Germany legal names definitely exist; I am not so sure about the U.S. |
D | display | purpose: for the purpose of "displaying"; however, this is quite vague. See below. |
M | maiden name | inherent meaning, but there are also quite pragmatic implications. See below. |
C | adopted | inherent meaning |
B | name at birth | inherent meaning |
P | name of spouse (name taken from) | inherent meaning |
U | unspecified | ?? (obsolete) |
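For an interface that has to translate v2.3.x names, the annotated table can be kept as a simple lookup. The Python sketch below is illustrative only. The purpose/inherent split is taken from the table above, but the correspondence between the "inherent" v2 codes and the Axis 2 name change classifiers (`V2_TO_AXIS2`) is an assumption of this sketch, read off the classifier descriptions; the group has not decided a mapping here. The names `V2_NAME_TYPE_KIND`, `V2_TO_AXIS2`, and `classify_v2_code` are hypothetical.

```python
# Kind of each v2.3.x name type code, as annotated in the table above.
V2_NAME_TYPE_KIND = {
    "A": "purpose",
    "L": "purpose",
    "D": "purpose",
    "M": "inherent",
    "C": "inherent",
    "B": "inherent",
    "P": "inherent",
    "U": "unclear",   # marked "?? (obsolete)" in the table
}

# ASSUMPTION: candidate Axis 2 classifiers for the codes with inherent meaning.
V2_TO_AXIS2 = {
    "M": "unmarried",  # usually called "maiden name"
    "C": "adoption",
    "B": "birth",
    "P": "spouse",
}


def classify_v2_code(code):
    """Return (kind, candidate Axis 2 classifier or None) for a v2.3.x code."""
    if code not in V2_NAME_TYPE_KIND:
        raise KeyError(f"unknown v2.3.x name type code: {code!r}")
    return V2_NAME_TYPE_KIND[code], V2_TO_AXIS2.get(code)


print(classify_v2_code("P"))  # ('inherent', 'spouse')
print(classify_v2_code("D"))  # ('purpose', None)
```

Codes of kind "purpose" deliberately have no entry in the second table, since they would map to a purpose code on the name variant rather than to a classifier on a single name part.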
We have not retained the term "alias," for three reasons. First, one main assumption of our new approach to person names is to support different name variants, where every variant is basically an alias for a person. Thus there is no need to further qualify that. Second, the term "alias" has a negative connotation (e.g., only thieves and other bad guys need aliases). Third and finally, there are different kinds of pseudonyms that we may want to indicate positively, i.e., artist's names (writer and stage names), indigenous (tribal) names, and religious names.
In opposition to aliases, in some countries there are legal acts of name changes. In Australia, for instance, this is called "deed poll".
In Germany such name changes happen under exceptional conditions only and are always subject to official recording. The naming system in Germany is quite tightly regulated and you are not supposed to use any other name, except in certain situations where one would expect pseudonyms (e.g., book authors, actors, etc.).
In the U.S., however, name changes seem to be more frequent than in Germany, and the naming system is less regulated than in Germany. One issue that one would need to clarify is the meaning of "legal" name. Legal name, obviously, has different meanings in different countries, depending on how the naming system is regulated.
The concept of display name was vague all along. The question is what display? The whole idea of names is that they are "displayed" on paper, computer screens, and in spoken language. The use case of display names thus is not clear. Basically there is no longer a need to have a name type "display name" in our new person name type. This is so, because we no longer distort the natural (or purposeful) ordering of the name parts by requiring name parts to be put in different slots. Name parts occur in some order that is defined or selected by someone, either the holder of that name or the computer system, or the citation style guide, etc.
Some names are used in licenses or other accreditations, and it is quite important to record the name as such. Examples are: school records, graduation certificates, license to practice a profession, etc. Notably, women who held a doctoral degree were the first to assume double names in Germany many decades ago. The reason was that their dissertations and certifications were issued under their maiden names. Had those women switched their family names entirely when they later married, they would have lost their certifications.
In many cases, keeping a name history is enough. However, the license name type allows one to indicate the reason why a certain name is still kept in the history, i.e., in this case, because it is mentioned in a license or record.
This was a very difficult discussion, in which many arguments were exchanged, but in which some people also said they could not even see the issue that was being so vigorously discussed.
Let's put this into historical perspective.
In versions 2.1 and 2.2 of HL7 there was no name type code at all, and the only place a "maiden" name was even mentioned was "PID-mother's maiden name". There was obviously no place to specify the patient's maiden name. This seemed to be somehow less of a problem in the U.S., but it was definitely a problem in Germany, which is why HL7 Germany redefined mother's maiden name to patient's maiden name.
Then came the name type code, and with it the maiden name type code. Its meaning was clear at that time, since there were just the maiden name and the adopted name. It probably was not quite clear what would happen with a female who was adopted at the age of 5, had a family name before, switched the family name through adoption, and later married and switched the name again. We had a way to express the name she had after adoption, and we were able to specify the name before marriage, which in this case are the same! So there were two ways to specify the same name, but no way to specify either the name before adoption or the name after marriage. This is pretty odd but, again, did not seem to matter very much.
The famous Dutch name change initiative that started with a Sermon by John Baptist in summer 1997's meeting in San Francisco (or was it Tampa?), was the major driving force for bringing in "birth" name and "spouse" name types. As far as I know, the rationale was not to address the oddities mentioned in the last paragraph. Rather, the issue was that "maiden" seemed to imply "female before marriage" or even stronger cultural connotations. Since the people of the Netherlands have long had a very reasonable and free culture, the Dutch did away with those sexist traditions long before the rest of the world even realized the issue.
So the driving force behind "birth" name was to open up the narrow sense of "maiden". In that sense, "birth" was clearly meant to subsume "maiden".
The "spouse" name type, on the other hand, was meant as a kind of antonym of "birth". The above examples around Irma Jongeneel are an extensive description of the Dutch naming system, which essentially explains why the "birth" and "spouse" name types are so important in the Netherlands. It is all because a married (or otherwise officially associated) couple of persons (not necessarily of opposite gender) will sort of combine their family names, while both names remain independently useful family names. That's why the birth name would get the "birth" classifier and the name of the spouse would get the "spouse" classifier.
From that perspective it seemed like "maiden" was subsumed by "birth", as a way to express the same concept with less sexist connotations.
But this was anything but agreed to by everyone.
It turned out that the Dutch reform has created more different notions than was originally expected. For example, again, what happens if someone changes his/her name before marriage? We finally decided that "maiden" and "birth" should not be merged, in part because "maiden name" is a cultural entity that may not exist in the Netherlands but still exists in many computer systems.
We made the observation that the above mentioned name types have different "directions" of meaning in time. They do not so much express what any name part is semantically, since family names are family names, but they try to capture how names come about. Dawid added, that those name types not only capture how names came about, but also, how names ceased to be used.
In the "ancient" U.S. name system of the 1950s and the German name system that loosened up only recently, the issues were simple. For instance, my wife's name is "Dorothea Schadow" but her maiden name is "Riemer".
If we mention the maiden name of my wife, we indicate that this maiden name, "Riemer", was used for her before she assumed my family name, "Schadow", through marriage. So her current name is "Schadow" and will remain "Schadow" for the foreseeable future. Her family name was "Riemer" but no longer so. Now, it is just her maiden name. Thus, "maiden" name seems not to explain how the name "Riemer" came about, but it tells how the name part "Riemer" ceased to be used.

          Riemer <---MAIDEN|
    -----------------------+------------------------------> lifetime
                           |CURRENT---> Schadow
From the perspective of this very traditional naming scheme, "maiden" and "current" are all you need to distinguish. And indeed most existing information systems are built on this traditional misconception. No matter how strongly we may insist on this through our database design, this is not how the world really works.
Since "maiden" is a term rooted in the traditional patriarchal system, we can define "maiden" name as:
A "maiden name" is the surname of a woman before she marries.
At least, this is what Webster's has to say about "maiden name". Clearly, this notion appears archaic today. But still the databases, data entry forms, and even application logic of ADT systems are sometimes built on this misconception.
Again, the Dutch people are the avant-garde of a more reasonable approach to looking at things. In the Dutch naming system the "directions" are different, as Irma's example showed that "maiden" is not an issue here:
In the Dutch system, all name parts point forward. The name types explain how name parts came about, not how they ceased to be used.

                           |BIRTH---> de Haas
    -----------------------+------------------------------> lifetime
                           |SPOUSE---> Jongeneel
From that perspective, "maiden" and "birth" do have different meanings. In the Dutch system the entire concept of "maiden name" simply no longer exists. In Germany and the U.S. it still exists.
One could assume that maiden marks a name that ceased to be used, but there seems to be no consensus on this position. At most I would open up the concept of "maiden name" to be less sexist, so that I would like to see the definition read as follows:
A maiden name is a name part that a person had immediately before this person's first marriage and that was given up due to that marriage.
By "marriage" I understand any kind of "culturally accepted personal association between human beings." This is open enough to include the wildest things as long as they are accepted in that culture (not necessarily accepted in other cultures). This includes homosexual marriages, religious (non-civil) marriages, and civil (non-religious) marriages; simply anything that causes someone to give up some of his/her name parts.
This is not just semantic talk. Practical connotations to a name part classified as "maiden" would be "don't use it", except in special circumstances or with special prefixes.
What happens if someone gets married and does not change her/his name?
From my perspective this is simple: "maiden name" simply does not apply.
However, one can argue the other way: since "maiden" means young unmarried girl, you do have a maiden name even though you might never have given up your name. Notably every maiden would have just a maiden name. Every unmarried person would have only a maiden name. Here it all depends on whether we think of names as slotted parts or as tagged parts. If name parts are slotted in data fields, the maiden name of a maiden is duplicated:
Pippi Langstrumpf

    (SlottedName :given-name "Pippi"
                 :current-name "Langstrumpf"
                 :maiden-name "Langstrumpf")

In our new system, however, we tag names without duplications:

Pippi Langstrumpf

    (LIST (PersonNamePart :value "Pippi" :classifiers (SET given))
          (PersonNamePart :value "Langstrumpf" :classifiers (SET family maiden (current))))
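The difference between slotted and tagged name parts can be sketched in a few lines of Python. This is only an illustration: the class names mirror the instance notation above, while the field layout and the `to_tagged` helper are assumptions made for this sketch and not part of any HL7 specification.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional, Set


@dataclass
class SlottedName:
    """Traditional record: one slot per kind of name (hypothetical layout)."""
    given_name: Optional[str] = None
    current_name: Optional[str] = None
    maiden_name: Optional[str] = None


@dataclass
class PersonNamePart:
    """Tagged part: one value that carries any number of classifiers."""
    value: str
    classifiers: FrozenSet[str] = field(default_factory=frozenset)


def to_tagged(name: SlottedName) -> List[PersonNamePart]:
    """Convert slots to tagged parts, merging slots that hold the same value."""
    parts: List[PersonNamePart] = []
    if name.given_name:
        parts.append(PersonNamePart(name.given_name, frozenset({"given"})))
    family: Dict[str, Set[str]] = {}  # name value -> classifiers seen so far
    if name.current_name:
        family.setdefault(name.current_name, {"family"}).add("current")
    if name.maiden_name:
        family.setdefault(name.maiden_name, {"family"}).add("maiden")
    for value, tags in family.items():
        parts.append(PersonNamePart(value, frozenset(tags)))
    return parts


pippi = to_tagged(SlottedName("Pippi", "Langstrumpf", "Langstrumpf"))
dorothea = to_tagged(SlottedName("Dorothea", "Schadow", "Riemer"))
```

For Pippi the duplicated slot collapses into a single part that carries both classifiers, while for Dorothea the two family names stay separate parts, one tagged "current" and one tagged "maiden".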
What it all boils down to is the following problems:
We gradually assumed the following rationale: birth name is the name you have at birth. Maiden name is the name you have just before your first marriage. An "Adoption name" is a name you have since you have been adopted (Beware of the ambivalence with "adopted name").
The immediate question becomes: what happens when you marry a second time? What if you are adopted after you first married (this can be done in some countries)? For me the question is, how many reasons of name changes do we have to capture? When is it enough to just keep a history of names?
The answer is probably: "it depends". In some cultures becoming a widow is a reason for a name change. In others you might change names as you give birth to children. You might also change names as you enter a religious community (e.g., as you become a monk, or a pope :-) Do we want to keep track of all this? Probably, it all depends.
For HL7 we have to stick to practical use cases. However, if we design the name data type according to a majority of existing information systems, we would still get stuck with the "first-m.i.-last" name pattern. A lot of the argument about maiden name was due to existing systems that either require a certain input or give a certain output. What should we do?
In general, we can recommend to consider only using the Dutch system, where we have a
The only strong rationale to keep maiden name is that mapping from a traditional slotted name structure to the new name style is difficult. With a "maiden name" you don't actually know whether this name was used already at birth ("birth") or came only through "adoption" or "deed poll". There is considerable overlap between the unmarried name classifier and the other classifiers of Axis 2. Consequently we had to relax the notion that axis 2 classifiers need to be mutually exclusive.
We recognized that the term "initials" may have slightly different meanings in an international context. In the Netherlands "initials" are all the first letters of your given names and family names as you choose.
In Holland there is also the concept of voorletters, which are the first letters of the given names. In Holland adults are normally recorded only using their voorletters and family names. This is similar to the Vancouver citation style that never spells out first names.
However, we confirmed that the term "initial" means first letter (of whatever), regardless of given or family name. The beautiful initials that start a chapter of medieval books are called "initials" too (e.g., the Schwabacher initials). When "initials" is used in the plural form in the context of names and signatures, it usually refers to all the initials of given and family names. It is then used as a short form of a signature.
A typical Dutch name using only voorletters would be recorded as a person name variant. We would not need to associate initials with spelled-out name parts.
Academic titles and professional credentials are like voorvoegsels and nobility titles on axis 4. You can classify academic degrees and professional titles as suffixes or prefixes. This keeps track of the problem that "PhD" and "MD" are suffixes but "Dr." and "Prof. Dr. med. Dr. phil. h.c." are prefixes.
The old HL7 address data types (AD, XAD) regarded an address as a data structure where each component had a special role. For instance, AD distinguished ZIP, city, state, country, street, and other parts of the address.
Over time people discovered more information elements that could be known about an address and added those elements as components to the address data type. Those additional components were county, census tract, etc. Those information items would normally not appear on mailing labels, and one would not necessarily ask for them if one were to go visit someone at a given address.
On the other hand it turned out that there are a number of information elements that do appear on mailing labels which are nevertheless rare and therefore remained unclassified. For instance, U.S. military addresses may have a unit designation "UNIT 2050" instead of a street and instead or in addition to a city. The name of a ship (e.g. "U.S.S. Enterprise") can appear instead of a city.
Internationally there are other address parts that may exist in one country but may be unknown in another country. For example, in U.S. addresses one finds directional codes like "N", "S", "W", and "E", which are essential to find a given address in one city. Those direction codes are unknown, for instance, in Germany.
Robin Zimmerman and Joann Larson have compiled an analysis of U.S. and some international addresses based on information of the Universal Postal Union (UPU). This work reinforces the observation that there are so many different kinds of address parts that creating a fixed data structure where every part has its slot is impractical. See also examples of worldwide addresses as published by the UPU. There is also an Australian standard that defines the pieces an address is made up of.
Another problem with the old address data types was that they ordered the parts of an address by the meaning of that part. The most important use case for address information, however, is printing a mailing label. In order to generate a mailing label it doesn't matter what the meaning of the different parts of an address is, as long as those parts appear at the appropriate place on the label.
The placement of address parts, however, depends on the country. For example, while in U.S. and most European addresses the ZIP code appears somewhere at the end, Japanese ZIP codes are written at the very top. In fact, Japanese addresses are written in the reverse direction: from the most general locator to the specific locations, with the name of the recipient appearing at the end.
Even in addresses of the north western part of the world there are such differences as to how ZIP code and city are placed. In Germany and most European countries, for instance, the ZIP code is placed in front of the city, while in England, the ZIP code appears after the city name on a separate line. In the U.S. the zip code follows the city and usually the state code. In most European countries, special country codes (different from ISO 3166 country codes) are written before the ZIP code (separated from the ZIP code by a dash). In U.S. and England country codes appear at the end. In Great Britain, however, the ZIP appears even after the country designator, whereas in the U.S.A. the country code appears at the very end.
In short, layout and meaning of address parts are independent (orthogonal) issues, but the address data type must take care of both. The focus, however, is not on the meaning of the parts, but on the layout. Although we could define a semantically very fine-grained address part classification, it would be impractical to use with the large majority of existing information systems that do not make those fine-grained semantic distinctions. There are simply too many different address parts and too many different country-specific variations that may or may not really correspond.
Thus, focussing primarily on the layout of address labels is a way to establish a greatest common denominator for interoperability. System A might store addresses in 5 lines. System B might distinguish ZIP code, country, state and a street line. System C might distinguish a house-number on the street line (common in Germany or Holland). System B can use system C's addresses and A can use addresses from both B and C.
It is still a problem how system C can find a house number in the street-line or how system B can identify a street-line in a list of lines received from system A. Rather than forcing everyone to make the most fine-grained distinction we require those systems who make the distinctions to deal with the less distinctive addresses.
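The easy direction, from a more distinctive system to a less distinctive one, can be sketched as follows. This is a minimal Python sketch and not a prescribed algorithm: the `downgrade` function and the tuple layout are hypothetical, and the role codes are the long codes from the address part role table given further down in these notes.

```python
from typing import List, Optional, Set, Tuple

# An address part as a (value, role) pair. A role of None is an untagged
# literal; a DEL part without a value stands for a line break.
Part = Tuple[Optional[str], Optional[str]]


def downgrade(parts: List[Part], known_roles: Set[str]) -> List[Part]:
    """Drop role tags the receiver does not know; keep all text and its order."""
    result: List[Part] = []
    for value, role in parts:
        if role == "DEL" or role in known_roles:
            result.append((value, role))
        else:
            result.append((value, None))  # falls back to a plain literal
    return result


# System C distinguishes the house number on the street line.
c_address: List[Part] = [
    ("Windsteiner Weg", "STR"),
    ("54A", "HNR"),
    (None, "DEL"),
    ("14165", "ZIP"),
    ("Berlin", "CTY"),
]

# System B (in this sketch) only knows ZIP code, country and state.
for_b = downgrade(c_address, known_roles={"ZIP", "CNT", "STA"})
```

Nothing is lost for printing a label, since the values and their order are untouched. The reverse direction, finding a house number inside an untagged street line, is the hard problem that is left to the more distinctive system.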
Postal or Residential Address | |||
---|---|---|---|
This Address data type is used to communicate postal addresses and residential addresses. The main use of such data is to allow printing mail labels (postal address), or to allow a person to physically visit that address (residential address). The difference between postal and residential address is whether or not there is just a post box. The residential address is not supposed to contain other information that might be useful for finding geographic locations or doing epidemiological studies. These addresses are thus not very well suited for describing the locations of mobile visits or the "residency" of homeless people. | |||
component name | type/domain | optionality | description |
purpose | Code Value | optional | A purpose code indicates what a given address is to be used for. Examples are: preferred residency (used primarily for visiting), temporary (visit or mailing, but see History), preferred mailing address (used specifically for mailing), and some more specific ones, such as "birth address" (to track addresses of small children). An address without a specific purpose code might be a default address useful for any purpose, but an address with a specific purpose code would be preferred for that respective purpose. |
bad address flag | Boolean | optional | Indicates that an address is not working. Absence of a status means "unknown" status, i.e., that it is presumably a good address. If the flag is set explicitly to false, it means that this address has been proven to work at least once. |
value | LIST OF Address Part | mandatory | This contains the actual address data as a list of address parts that may or may not have semantic tags. |
Address Part | |||
---|---|---|---|
This type is not used outside of the Address data type. Addresses are regarded as a token list. Tokens usually are character strings but may have a tag that signifies the role of the token. Typical parts that exist in almost every address are ZIP code, city, and country, but other roles may be defined regionally, nationally, or on an enterprise level (e.g. in military addresses). Addresses are usually broken up into lines, which is indicated by special line break tokens. | |||
component name | type/domain | optionality | description |
value | Character String | mandatory (exception: line break tokens) | The value of an address part is what is printed on a label. |
role | Code Value | optional | The role of an address part (if any) indicates whether an address part is the ZIP code, city, country, post box, etc. |
Short | Long | Meaning |
---|---|---|
R | RES | residency: used primarily to visit an address. |
P | PO | postal address: used to send mail. |
T | TMP | temporary address: visit or mailing, but see History. |
B | BRTH | birth address: CDC uses those for child immunization. |
... |
Short | Long | Meaning |
---|---|---|
L | LIT | literal: this is the default role code |
K | DEL | delimiter: stuff printed without framing whitespace. Line break if no value component provided. |
C | CNT | country |
T | CTY | city (town) |
E | STA | state ("E" as in French état, which should reconcile the French who have to use "E" for their "departements") |
Z | ZIP | ZIP code |
H | HNR | house number (aka. "primary street number", however, it is not the number of the street, but the number of the house or lot alongside the street.) |
A | ADL | additional locator: can be a unit designator, such as apartment number or suite number, but also floor. There may be several unit designators in an address to cover things like: "3rd floor, Appt. 342". This can also be a designator that points away from the location, rather than specifying a smaller location within some larger one. Example is Dutch "t.o." to mean "opposite to" for house boats. |
S | STR | street name or number |
ST | STT | street type (e.g. street, avenue, road, lane, ...) (probably not useful enough) |
D | DIR | direction (e.g., N, S, W, E) |
P | POB | P.O. Box |
... |
Please note that the person name is not part of our address type even though it is mentioned by UPU and Joann/Robin's list.
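For readers who prefer code over tables, the two type definitions above could be written down roughly like this. It is a minimal Python sketch under our own assumptions: the class and field names are transliterated from the tables, the enum uses the long role codes as member names and the short codes as values, and the check in `__post_init__` is just one way to read "mandatory, except for line break tokens".

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Role(Enum):
    """Address part roles: long code as name, short code as value."""
    LIT = "L"   # literal, the default role
    DEL = "K"   # delimiter; a line break if it has no value
    CNT = "C"   # country
    CTY = "T"   # city (town)
    STA = "E"   # state
    ZIP = "Z"   # ZIP code
    HNR = "H"   # house number
    ADL = "A"   # additional locator
    STR = "S"   # street name or number
    STT = "ST"  # street type
    DIR = "D"   # direction
    POB = "P"   # P.O. Box


@dataclass
class AddressPart:
    value: Optional[str] = None  # what is printed on the label
    role: Role = Role.LIT        # LIT is the default role

    def __post_init__(self) -> None:
        # The value is mandatory, except for line break tokens.
        if self.value is None and self.role is not Role.DEL:
            raise ValueError("only a DEL part may omit its value")


@dataclass
class Address:
    value: List[AddressPart]           # mandatory list of address parts
    purpose: Optional[str] = None      # e.g. "RES", "PO", "TMP", "BRTH"
    bad_address: Optional[bool] = None # None means "unknown" status


example = Address(
    purpose="RES",
    value=[
        AddressPart("Windsteiner Weg 54A"),
        AddressPart(role=Role.DEL),
        AddressPart("14165", Role.ZIP),
        AddressPart("Berlin", Role.CTY),
    ],
)
```

Note how the structure is a list of values that may carry a tag, rather than a record of named slots.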
A U.S. address

1028 Pinewood Court
Indianapolis, IN 46240
U.S.A.

    (Address
      (LIST (AddressPart :value "1028 Pinewood Court")      ; LIT is the default role
            (AddressPart :role "DEL")                       ; DEL's value is newline by default
            (AddressPart :value "Indianapolis" :role "CTY")
            (AddressPart :value ", " :role "DEL")           ; DEL comes w/o extra space
            (AddressPart :value "IN" :role "STA")
            (AddressPart :value "46240" :role "ZIP")
            (AddressPart :role "DEL")                       ; DEL's value is newline by default
            (AddressPart :value "U.S.A." :role "CNT")))
A German address

Windsteiner Weg 54A
D-14165 Berlin

    (Address
      (LIST (AddressPart :value "Windsteiner Weg 54A")      ; LIT is the default role
            (AddressPart :role "DEL")                       ; DEL's value is newline by default
            (AddressPart :value "D" :role "CNT")
            (AddressPart :value "-" :role "DEL")            ; no whitespace before and after
            (AddressPart :value "14165" :role "ZIP")
            (AddressPart :value "Berlin" :role "CTY")))
Address labels contain white space. The white space rules used in typesetting are not trivial. In general, two words are separated by white space. A punctuation mark, like a comma or period, directly follows the preceding non-whitespace text, but those marks are always followed by whitespace. Dashes are not surrounded by whitespace at all. Note that the whitespace rules do not really exist for languages such as Thai or Japanese, where white space is basically not used. However, you can always simply ignore whitespace, which is why Thai and Japanese are easier to print. In any case, neither Thai nor Japanese would have whitespace where it was not allowed in Latin script.
The difficult whitespace rules can, for the purpose of the Address data type be broken down into only six precise rules:
White space never accumulates, i.e. two subsequent spaces are the same as one. Subsequent line breaks can be reduced to one. White space around a line break is not significant.
Literals may contain explicit white space, subject to the same white space reduction rules. There is no notion of a literal line break within the text of a single address part.
Leading and trailing explicit whitespace is insignificant in all address parts, except for delimiter (DEL) address parts.
By default an address part is surrounded by implicit white space.
Delimiter (DEL) address parts are not surrounded by any implicit white space.
Leading and trailing explicit whitespace is significant in delimiter (DEL) address parts.
This means that all address parts are generally surrounded by white space, but white space never accumulates. Delimiters are never surrounded by implicit white space, and any whitespace contributed by preceding or succeeding address parts is discarded, whether it was implicit or explicit. For example, all of the following variants

    (lit "1028") (lit "Pinewood Court")
    (lit "1028 ") (lit "Pinewood Court")
    (lit "1028") (lit " Pinewood Court")
    (lit "1028 ") (lit " Pinewood Court")
    (lit "1028 ") (lit " Pinewood Court")

are printed the same way:

    "1028 Pinewood Court"

with only one white space between "1028" and "Pinewood Court".
A DEL address part is a delimiter, and would never be framed by implicit white space. As noted above, a comma is always followed by white space, but this whitespace would have to be part of the value part of the delimiter. HL7 systems do not have to enforce all those typographical rules. For example, all of the following variants

    (lit "Indianapolis") (del ", ") (lit "IN")
    (lit "Indianapolis ") (del ", ") (lit "IN")
    (lit "Indianapolis") (del ", ") (lit " IN")
    (lit "Indianapolis ") (del ", ") (lit " IN")

are printed the same way:

    "Indianapolis, IN"

with no white space before the comma and only one white space after the comma, i.e. the white space that has been provided literally in the delimiter value string. This literal space could have been missing, as in the following cases

    (lit "Indianapolis") (del ",") (lit "IN")
    (lit "Indianapolis ") (del ",") (lit "IN")
    (lit "Indianapolis") (del ",") (lit " IN")
    (lit "Indianapolis ") (del ",") (lit " IN")
    (lit "Indianapolis") (del ",") (lit " IN")

which are printed all the same way:

    "Indianapolis,IN"

without the space after the comma. This is not good typographic style, but it is not enforced by HL7 rules. No space is wanted around dashes, such as in European addresses:

    (cnt "D") (del "-") (zip "12200") (cty "Berlin")
    (cnt "D ") (del "-") (zip "12200") (cty "Berlin")
    (cnt "D ") (del "-") (zip "12200") (cty " Berlin")

which are printed all the same way:

    "D-12200 Berlin"
A DEL address part does not need a value, since a DEL's value is a line break by default. Note that our whitespace rules apply nicely to line breaks, since a line break makes trailing white space of the previous line redundant and leading white space of the subsequent line is correctly removed too.
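The white space rules above are simple enough to be written down as a small label printing routine. The following Python sketch is one possible reading of the rules, not a normative algorithm; the function name `render` and the (role, value) tuple layout are assumptions made for the example.

```python
import re
from typing import List, Optional, Tuple

# An address part as a (role, value) pair; only a DEL part may have value None.
Part = Tuple[str, Optional[str]]


def render(parts: List[Part]) -> str:
    """Print an address label according to the white space rules."""
    out: List[str] = []
    after_delimiter = True  # nothing printed yet, so no implicit space is due
    for role, value in parts:
        if role == "DEL":
            # Delimiters are printed as given, explicit white space included;
            # a delimiter without a value is a line break.
            out.append("\n" if value is None else value)
            after_delimiter = True
        else:
            # Leading and trailing explicit white space is insignificant,
            # and white space inside a literal does not accumulate.
            text = " ".join((value or "").split())
            if not after_delimiter:
                out.append(" ")  # implicit white space between two parts
            out.append(text)
            after_delimiter = False
    lines: List[str] = []
    for line in "".join(out).split("\n"):
        # White space around a line break is not significant, and
        # subsequent line breaks reduce to one.
        line = re.sub(r"\s+", " ", line).strip()
        if line:
            lines.append(line)
    return "\n".join(lines)


us_address: List[Part] = [
    ("LIT", "1028 Pinewood Court"),
    ("DEL", None),
    ("CTY", "Indianapolis"),
    ("DEL", ", "),
    ("STA", "IN"),
    ("ZIP", "46240"),
    ("DEL", None),
    ("CNT", "U.S.A."),
]
label = render(us_address)
```

Applied to the variants listed above, the routine gives the same result for every variant of a group, e.g. `render([("CNT", "D "), ("DEL", "-"), ("ZIP", "12200"), ("CTY", " Berlin")])` yields "D-12200 Berlin". The design choice is that a non-delimiter part only contributes an implicit space when the part before it was not a delimiter, which is all that is needed to keep delimiters unframed.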
The following is another U.S. address with maximal tagging of the address parts:
1001 W 10th Street RG5
Indianapolis, IN 46202
U.S.A.

    (Address
      (LIST (AddressPart :value "1001" :role "HNR")
            (AddressPart :value "W" :role "DIR")
            (AddressPart :value "10th" :role "STR")
            (AddressPart :value "Street" :role "STT")
            (AddressPart :value "RG5" :role "LIT")
            (AddressPart :role "DEL")
            (AddressPart :value "Indianapolis" :role "CTY")
            (AddressPart :value ", " :role "DEL")
            (AddressPart :value "IN" :role "STA")
            (AddressPart :value "46202" :role "ZIP")
            (AddressPart :role "DEL")
            (AddressPart :value "U.S.A." :role "CNT")))
The instance notation shows how different the new address type is compared with the old HL7 AD/XAD types.
This address type is an interesting construct: It is kind of the inverse of a record data structure. In a record, we have a bunch of slots that may or may not contain data. In this data type we have a bunch of data that may or may not be assigned slots.
It is especially interesting to see how this data type maps into XML. An automatic mapping (as the one used for the HIMSS demo) would create a very long, unreadable XML. But the reason for the popularity of XML is that markup can be added gently to a basically "human readable" text. XML-wise, a much nicer representation would be:

    1001 W 10th Street RG5 Indianapolis, IN 46240 U.S.A.

The contents of this address could now be refined:

    1001 W 10th Street RG5 Indianapolis, IN 46240 U.S.A.

Note that in the above representation we at least allowed address part roles to occur as XML attributes. If DTDs were not used, one could even create a nicer representation if we turn the role codes into XML tags.

    1001 W 10th Street RG5 Indianapolis ,IN 46240 U.S.A.
Actually the address data type is an example for the paradigmatic use case of XML: a bunch of data that may or may not be further marked up. It would be very odd if we would not use XML in this classic way for this classic use case.
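To make the idea of gently added markup concrete, here is a small Python sketch that serializes a partially tagged address as XML mixed content, once with the role as an attribute and once with the role code as the tag. The element names `address` and `part` and the attribute name `role` are purely hypothetical, chosen for this illustration; they are not taken from any HL7 DTD.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple
from xml.sax.saxutils import escape

# (text, role) pairs; a role of None means plain, unmarked text.
Part = Tuple[str, Optional[str]]


def to_xml(parts: List[Part], roles_as_tags: bool = False) -> str:
    """Serialize parts as mixed content of a hypothetical <address> element."""
    chunks: List[str] = []
    for text, role in parts:
        body = escape(text)
        if role is None:
            chunks.append(body)  # human readable text stays as it is
        elif roles_as_tags:
            chunks.append(f"<{role}>{body}</{role}>")
        else:
            chunks.append(f'<part role="{role}">{body}</part>')
    return "<address>" + "".join(chunks) + "</address>"


def plain_text(xml: str) -> str:
    """Strip all markup again: what a receiver ignoring the tags would print."""
    return "".join(ET.fromstring(xml).itertext())


# Only some parts are marked up; the rest remains ordinary text.
parts: List[Part] = [
    ("1001", "HNR"),
    (" ", None),
    ("W", "DIR"),
    (" 10th Street RG5\nIndianapolis, ", None),
    ("IN", "STA"),
    (" ", None),
    ("46202", "ZIP"),
    ("\nU.S.A.", None),
]

with_attributes = to_xml(parts)
with_tags = to_xml(parts, roles_as_tags=True)
```

Whichever form is chosen, a receiver that ignores the markup still gets the printable label back, which is the point of using XML in this way.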
Should we allow for address part values other than mere Character Strings? Especially, should we allow for code values? Using code values seems to make sense for things like country code and state. Using a code table for state or countries is of course safer and allows to process addresses into groups.
While this is possible in general, we have three problems:
The data type definition and all of the instances would become more complex, since we have to define the AddressPart.value as a type choice between CharacterString and CodeValue (or even ConceptDescriptor!)
While there are codes for U.S. states and countries (e.g., ISO 3166 Country Code) those codes are not used uniformly. There are two forms to abbreviate U.S. states, e.g., the Commonwealth of Massachusetts can be "MA" or "Mass.". While the ISO country code is suggested for international use, there is a long tradition in Europe to abbreviate countries in a different code (same that is used for country stickers on cars). Thus, the ISO code for Germany is "DE" but "D" is used all over Europe.
Since there are different code tables in use one might even require the Concept Descriptor data type to account for the translations. This is a considerable overhead, for what use?
The use case of codes in addresses is very limited. If a receiver really wants to rely on those codes, we set up a number of requirements that did not exist before. (1) the address part must be tagged with an explicit role, (2) the right code must be used by the sender. The use case to code addresses is very localized, which means, the coding of address parts may be needed in one application but it is not needed in many others. In order to print labels and visit people, coded address parts are not essential.
We probably do not want to make the address data type any more complex than it already is. HL7 should certainly not impose more requirements to code certain address parts. It just seems not to be a widely demanded use case, and an a priori argument for coded address parts, which could offset the lack of use cases, seems not to exist.
However, there is one powerful way in which the simpler address data type defined here can meet the needs of those who would like to have coded address fields: type casting.
Through type casting a message would be valid even though the sender put a CodeValue, or ConceptDescriptor in place of a CharacterString. This means, a sender, who does code address parts, is able to send his coded address parts to a peer, who also prefers to receive coded address parts where possible. Thus, an implementation may behave as if the address data type would be defined in a more complex way.
The point is, we don't have to make the HL7 specification more difficult to understand and implement for those who do not want this extra feature of coded address parts and still allow those who want to deal with the extra work to go ahead and do it. This is another example where implicit type casting in a well defined type system proves extremely useful: the canonical specification can remain simple, and still extra requirements can be supported in a compatible way!
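How such a type cast could look in an implementation is sketched below in Python. This is an illustration only: the `CodeValue` fields, the function names, and in particular the assumption that casting a code value to a character string yields the code itself are choices made for this sketch, not rules stated in these notes.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class CodeValue:
    code: str         # e.g. "DE"
    code_system: str  # e.g. "ISO 3166"; naming is illustrative


# What a sender may put in place of the value of an address part.
PartValue = Union[str, CodeValue]


def as_character_string(value: PartValue) -> str:
    """Implicit cast: a receiver that expects plain text accepts codes too."""
    if isinstance(value, CodeValue):
        return value.code  # assumption: the code itself is the printable text
    return value


def as_code(value: PartValue) -> Optional[CodeValue]:
    """A receiver that prefers codes uses them where the sender provided one."""
    return value if isinstance(value, CodeValue) else None


country = CodeValue("DE", "ISO 3166")
label_text = as_character_string(country)   # simple receiver: just print it
coded = as_code(country)                     # code-aware receiver: keep the code
free_text = as_character_string("Germany")   # plain strings pass through
```

The simple receiver never needs to know that `CodeValue` exists beyond the cast, which is why the canonical definition of the address part value can stay a plain character string.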
We need much less flexibility and power with organization names. We considered what there might be to organization names:
Different name parts, such as "Hewlett-Packard" vs. "HP" vs. "Inc.", "Co.", "Ltd.", "B.V.", "AG", "GmbH", etc.
"Marriage" of companies and trading of divisions. Thus, UNIX was a trademark of AT&T, then USL, then Novell, and who knows. "Daimler" and "Chrysler" are now "Daimler-Chrysler", and "Behring", a manufacturer of vaccines, is known by or subsumed under some other name in the U.S.
Anyway, we concluded that no one really keeps track of those things, so all we need is an organization name string and, perhaps, a name type code. HL7 v2.3 had a name type code table for organization names (XON) including:
Code | Meaning |
---|---|
L | legal |
A | alias |
D | display |
ST | stock exchange |
Display name has no defined use, since names are always displayed and it begs the question "whose display?". I wonder whether anyone in healthcare would want to include the Wall Street ticker symbol or the Indianapolis Star newspaper's abbreviation of a manufacturer of vaccines. But there is no reason why we should restrict this existing "feature" of version 2.3.
All in all this is not a very controversial or important issue. So, unless there is any significant objection we can just stick to a v2.3-like solution.
Organization Name (ON) | |||
---|---|---|---|
A collection of organization name variants. | |||
SET OF Organization Name Variant |
Organization Name Variant | |||
---|---|---|---|
This type is not used outside of the Organization Name data type. Organization Names are regarded as a collection of organization name variants each used in different contexts or for a different purpose. | |||
component name | type/domain | optionality | description |
type | Code Value | optional | A type code indicates what an organization name is to be used for. Examples are: alias, legal, stock-exchange. |
value | Character String | mandatory | This contains the actual name data as a simple character string. |
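In code the organization name is almost trivial, which is the point. A minimal Python sketch, assuming the v2.3 type codes listed above; the class name, the `pick` helper and the sample values are made up for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class OrganizationNameVariant:
    value: str                  # mandatory: the name as a simple string
    type: Optional[str] = None  # optional: "L", "A", "D" or "ST"


# An organization name is just a set of variants.
OrganizationName = FrozenSet[OrganizationNameVariant]


def pick(name: OrganizationName, type_code: str) -> Optional[str]:
    """Return a variant with the given type code, if there is one."""
    for variant in name:
        if variant.type == type_code:
            return variant.value
    return None


hp: OrganizationName = frozenset({
    OrganizationNameVariant("Hewlett-Packard", "L"),
    OrganizationNameVariant("HP", "A"),
})
legal = pick(hp, "L")
ticker = pick(hp, "ST")  # no such variant in this sample
```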
Next conference call is next Monday, March 22, 1999, 11:00 AM EST.
We are back to our usual time slot, since the international attendees have dropped out as we closed up the PAFM related issues. Thanks for all the help by PAFM and the internationals without whom we would not have solved the issues around Real World Identifier, Person Name, Address, and Organization Name!
Agenda items for next time are:
It seems like the rest is quite uncontroversial (except, perhaps, for open issues and technical religiosity.) But it is a lot of further detail work. We will be unable to complete this if we do not find a way to assign and complete homeworks.
regards
-Gunther Schadow