HL7 v3.0 Data Types
Implementable Technology Specification (ITS)
Extensible Markup Language (XML)

Gunther Schadow
Regenstrief Institute for Health Care

1 Introduction

Name vs. Type

We consistently use XML tag names for the names of components, not for their types. For example, for a Patient object that has a component named dateOfBirth of type PointInTime (PT), we would use the name dateOfBirth as the XML tag. We use variable names or component names for XML tags, not types. Thus PointInTime would not occur as an XML tag.

Element vs. Attribute

XML allows data to appear as content of an element (i.e. data between a start tag and an end tag) or as attributes (i.e. given as part of the start tag). XML attributes can only contain unstructured data, i.e. character strings and character string literals (e.g. for numbers or point in time values). XML elements can contain a character string literal, tagged component elements, or both.

This ITS does not distinguish between XML elements and attributes. All components of HL7 data types can occur as attributes or elements. The name of the attribute is the same as the name of the element and defined with the component. One component of each data type can appear as the element's content. This content-component usually is the required (i.e. non-optional) "main" information of the data type, which is often named "value," or "data." If all other components are optional ("?", "*", or "#IMPLIED"), the content-component may occur in an attribute.
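
Because a component may arrive either as an attribute or as a child element of the same name, a receiver can normalize a value by folding attributes into elements before interpreting it. The following Python sketch illustrates the idea with the standard xml.etree.ElementTree module. The function name fold_attributes and the decision to leave the TY attribute in place are assumptions made for this illustration; they are not defined by this ITS.

```python
import xml.etree.ElementTree as ET


def fold_attributes(elem):
    """Rewrite every attribute of elem as a child element of the same name.

    Assumption for this sketch: TY carries type information rather than a
    data type component, so it is left in place as an attribute.
    """
    for name, value in list(elem.attrib.items()):
        if name == "TY":
            continue
        child = ET.SubElement(elem, name)  # same name for tag and attribute
        child.text = value
        del elem.attrib[name]
    # Apply the same rule to nested values.
    for child in list(elem):
        fold_attributes(child)
    return elem


value = fold_attributes(ET.fromstring('<SEX VAL="M" SYS="HL7-0001"/>'))
print(ET.tostring(value, encoding="unicode"))
```

After folding, a receiver only has to look for child elements, whichever form the sender chose.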

Implicit typing

An XML element represents a value with a name. The name (e.g., dateOfBirth) is represented by the XML tag name. The value is found as attributes or content data or child elements of the XML element. HL7 specifies the data type to be expected for each value. If the given value conforms to the element's expected type, a message need not contain any explicit type information.

The actual type of an element may be different than the expected type if there is a conversion rule from the given type to the expected type. If the given value is of an unexpected type, that type must be explicitly specified. The type of an element can be specified using the TY attribute.

The TY attribute determines all the features of a value, i.e. all the expected attributes, child elements, and content data, as well as their representation, are determined by the TY attribute of an element.

Why don't we name that attribute "T"?

Anonymous Values

The name of XML elements always represents the name of a variable or component. However, sometimes values are not assigned to names but may appear anonymously. Anonymous values usually occur as elements of collections, such as sets or lists. Anonymous values are supplied using the A element.

<NAME>
  <A TY="CS">Gunther</A>
  <A TY="CS">Schadow</A>
</NAME>

How the XML ITS is specified

Document Type Definitions (DTD) were once the suggested way to define SGML and XML documents. However, the DTD language has severe deficiencies, especially after the XML "simplification".

DTDs have essentially no notion of the difference between data names and data types.
XML invented the idiosyncratic difference between "attributes" and "elements." Both attributes and elements serve essentially the same need, i.e. to label values, but each does this in a different way. The distinction between attributes and elements (once motivated by the idea of a main document text with sporadic annotational information) provides no value; instead it forces the designer to engage in unreasonable decisions and tradeoffs.
DTDs allow the specification of syntax, but the XML syntax is totally free of semantics (not even the simplest macro expansion facility is provided).
Using DTDs (with XML instead of SGML) requires adherence to a fixed relative positioning of elements. This is not only unnecessary, it jeopardizes one of the key benefits of a tagged representation over a positional representation (such as the traditional HL7 encoding rules).
The DTD language does not allow constraints to be specified.

For the above reasons, DTDs are not suited as the basis of the HL7 XML data type definitions. A similar assessment of DTDs has led the W3C to nearly abandon the maintenance and development of DTDs; instead, the W3C has switched to RDF as the language for further defining the XML language.

This ITS specification will define data type representations using tables that show the representational components of HL7 data types for XML and associate with each component certain properties. These properties define how each component is to be represented in XML, e.g., whether it can appear as an attribute, as an element, or as a literal character data of an attribute or an element.

This section outlines algorithms for both building and analyzing XML expressions of HL7 data values. The meaning of the data type tables, though explained in plain English, is ultimately defined by the prototype algorithms.

The following is a prototype table showing all the columns of the data type definition tables and all possible property values.

*Data Type Name* (*Abbreviation*)
Name	XML name	O	C	TY	A	Notes
long name	tag name	R	C	type	A
long name	tag name	O		type	a
long name	tag name	D	X	type
long name	tag name	X		type
long	short	RODX	C X	ty	Aa	notes

Each row in the table defines one component of the XML representation of HL7 data types. The HL7 Data Type Specification uses similar tables to help define the data types. However, the components shown in the ITS independent specification are semantic components, not representational components. By and large, each semantic component of a data type has a corresponding representational component in the ITS. However, sometimes a semantic component requires more than one representational component in an ITS (e.g., "encoding" of binary data). Other times, a semantic component is conveyed implicitly as part of a representational component corresponding to another semantic component (e.g., precision of floating point values).

The columns of the table mean the following:

Name

The long name of the component. Usually the name used in the ITS independent HL7 Data Type Specification. Otherwise the component is defined in this ITS specification.

XML name

The name used for the XML representation. Since we do not distinguish between XML attributes and elements, we use the same name for tags and attributes.

O

Occurrence (e.g. optional/required). The values used are:

value	meaning
R	required, component must be present but, if appropriate, can be assigned a no-information (NULL) value.
O	optional, component need not be present, in which case it assumes a default value or a no-information value.
D	component must be explicitly provided if no default is defined at the point of use of the data type.
X	special occurrence rule applies as specified in the notes column.

C

This column specifies whether the component can appear as a character data content of a value element. Usually one and only one component has the C property. Also usually, it is a fairly essential and required component that has the C property.

value	meaning
C	component can appear as character data content of an element.
(blank)	component cannot appear as character data content of an element.
X	a special rule applies as specified in the notes column.

TY

The default type of the component. This is the expected type and the default for the TY attribute of the component if none is present. If the type provided is different from this expected type and is not a string literal for the expected type, an explicit TY attribute is required.

A

Specifies whether the component can appear as an attribute. In general, all components can appear as an attribute (little-a property). In addition, one component, usually the one having the C property, can appear in an attribute of an enclosing element (big-A property). For example, if type T is defined as having components c₁ and c₂ and if c₁ has the big-A property and if BAR is a component of FOO defined as of type T then

<FOO BAR="v₁"/>

can be used instead of

<FOO> <BAR c₁="v₁"/> </FOO>

The A-property values are

value	meaning
A	big-A: component can be used stand-alone for the entire data type, i.e. the data type can be used for an attribute value giving only this component's value.
a	little-A: component can be represented as an attribute of an element representing a value of this data type.

Notes

This column contains important usage notes and constraints.

Boolean

A simple boolean value is specified as an entity reference using the entity reference "&true;" for true and "&false;" for false.

To what do those entities expand? Can we reserve special literals starting with "#" (i.e., #T and #F)?

No Information

A simple no information value is assumed where an XML element or attribute is not provided.

Alternatively an explicit no information value can be provided using the entity reference "&null;". Different flavors of no-information will be defined as "&null.flavor;".

How? Can we reserve special literals starting with "#" (e.g., #NULL, #UNK, #IINF)?

2 Text

2.1 Character String

The Character String is the only "datatype" that XML fully supports. XML character data meets all the requirements for HL7 Character Strings.

XML character data can appear as attribute data or as the content data of XML elements. Character data of XML elements can contain other elements (mixed content, PCDATA). Character data of XML attributes can be subject to further constraints.

The characters less than ("<"), ampersand ("&"), double quote ('"') and apostrophe ("'") have special meaning in XML and often cannot appear literally as character data. Therefore these characters should always be rewritten as "&lt;", "&amp;", "&quot;", and "&apos;" respectively.

XML character strings can contain end-of-line tokens. This is consistent with the definition of HL7 character strings. However, in XML, end-of-line tokens are always normalized to ASCII line-feed (LF) characters, while in HL7 we did not specify such behavior. Thus, XML character strings do not make any difference between a carriage-return (CR), the sequence CR+LF and the simple LF character.

2.2 Binary Data

XML only supports character data, not binary data. Thus, binary data must be encoded in characters. Many character encodings exist, such as hexadecimal digits (4 bit/char), uu-encoding (6 bit/char), base64 encoding (6 bit/char), or base 85 encoding (6.4 bit/char). Base64 encoding has become popular as it is used with MIME e-mail. Base64 is not necessarily the best encoding as it is a little less dense than, e.g., base 85. But there is no strong point in supporting multiple encodings, which is why currently only base64 is allowed for true binary data.

Since the BIN data type is used primarily in the free text (FTX) data type, another encoding is defined for the encoding of text data. Both encodings are specified normatively in the following subsections.

Binary Data (BIN)
Name	XML name	O	C	TY	A	Notes
encoding	ENC	O		CV	a	mandatory table
data	DATA	R	C	ST	A

Code for Encoding of Binary Data
Code	D	Meaning
BASE64	D	Base 64 encoding of 24 bits in four characters.
TEXT		Text encoding (used only for actual character data).

Thus the following examples are legal

<FOO>
YOLckIwZua6MMVtjuGFNdyw7r+9h6W2kt+pFl7SR7KTtwnyJSkCIaflI84L6P
7SKVHX2zUftEduysr98BUEsBAhQAFAAAAAgA3JJ0JnduMIMHTgAAAEABABEAA
AAAAAGNoNG91dGxpbmUuZDQuZG9jUEsFBA==
</FOO>

<FOO ENC="base64">
YOLckIwZua6MMVtjuGFNdyw7r+9h6W2kt+pFl7SR7KTtwnyJSkCIaflI84L6P
7SKVHX2zUftEduysr98BUEsBAhQAFAAAAAgA3JJ0JnduMIMHTgAAAEABABEAA
AAAAAGNoNG91dGxpbmUuZDQuZG9jUEsFBA==
</FOO>

<FOO ENC="TEXT">
This is text data encoded in the default encoding used for this
message. Most likely it is encoded in UTF-8. Text data is converted
back to bytes according to a transformation specified elsewhere.
</FOO>

Base-64 encoding

The following definition is by and large a verbatim copy of the Internet standard RFC 2045. Modifications include removing text that speaks about MIME, e-mail, or SMTP, which do not affect the specification. However, the original terms of RFC 2045 have been loosened up to avoid overspecification. Notably

Base64 data does not need to be broken into lines at all (or can be broken at line length different from 76 characters per line.)
Characters not in the base64 alphabet are completely ignored and not taken as indication of error.
Padding is recommended but not required. Correct padding is always assumed by the receiver at the end of a base64 data block.
No assumption about line breaks of the originally encoded data is made; specifically, the suggestion to convert to the Internet-canonical CR+LF line breaks has been dropped.

The Base64 Content-Transfer-Encoding is designed to represent arbitrary sequences of octets in a form that need not be humanly readable. The encoding and decoding algorithms are simple, but the encoded data are consistently only about 33 percent larger than the unencoded data.

A 65-character subset of US-ASCII is used, enabling 6 bits to be represented per printable character. (The extra 65th character, "=", is used to signify a special processing function.)

The encoding process represents 24-bit groups of input bits as output strings of 4 encoded characters. Proceeding from left to right, a 24-bit input group is formed by concatenating 3 8bit input groups. These 24 bits are then treated as 4 concatenated 6-bit groups, each of which is translated into a single digit in the base64 alphabet. When encoding a bit stream via the base64 encoding, the bit stream must be presumed to be ordered with the most-significant-bit first. That is, the first bit in the stream will be the high-order bit in the first 8bit byte, and the eighth bit will be the low-order bit in the first 8bit byte, and so on.

Each 6-bit group is used as an index into an array of 64 printable characters. The character referenced by the index is placed in the output string. These characters, identified in Table 1, below, are selected so as to be universally representable.

Table 1: The Base64 Alphabet
Value	Encoding	Value	Encoding	Value	Encoding	Value	Encoding
0	A	17	R	34	i	51	z
1	B	18	S	35	j	52	0
2	C	19	T	36	k	53	1
3	D	20	U	37	l	54	2
4	E	21	V	38	m	55	3
5	F	22	W	39	n	56	4
6	G	23	X	40	o	57	5
7	H	24	Y	41	p	58	6
8	I	25	Z	42	q	59	7
9	J	26	a	43	r	60	8
10	K	27	b	44	s	61	9
11	L	28	c	45	t	62	+
12	M	29	d	46	u	63	/
13	N	30	e	47	v
14	O	31	f	48	w	pad	=
15	P	32	g	49	x
16	Q	33	h	50	y

Any characters outside of the base64 alphabet are to be ignored in base64-encoded data. The encoded output stream can be represented in several lines, where about 76 characters per line is advisable. All line breaks or other characters not found in Table 1 must be ignored by decoding software.

Special processing is performed if fewer than 24 bits are available at the end of the data being encoded. A full encoding quantum is always completed at the end of a body. When fewer than 24 input bits are available in an input group, zero bits are added (on the right) to form an integral number of 6-bit groups. Padding at the end of the data is performed using the "=" character. Since all base64 input is an integral number of octets, only the following cases can arise: (1) the final quantum of encoding input is an integral multiple of 24 bits; here, the final unit of encoded output will be an integral multiple of 4 characters with no "=" padding, (2) the final quantum of encoding input is exactly 8 bits; here, the final unit of encoded output will be two characters followed by two "=" padding characters, or (3) the final quantum of encoding input is exactly 16 bits; here, the final unit of encoded output will be three characters followed by one "=" padding character.
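
The encoding procedure and the three padding cases can be sketched in Python as follows. This is an illustration only, not a normative part of this ITS; the function name encode_base64 is an assumption made for the example, and the output is not broken into lines, which the rules above permit.

```python
import string

# The 64-character alphabet of Table 1, indexed by 6-bit value.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_base64(data: bytes) -> str:
    """Encode octets as base64, padding the final quantum with "="."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Left-align the 1..3 octets in a 24-bit group; zero bits fill the right.
        group = int.from_bytes(chunk, "big") << (8 * (3 - len(chunk)))
        # 1 octet -> 2 characters, 2 octets -> 3 characters, 3 octets -> 4 characters.
        nchars = len(chunk) + 1
        for k in range(nchars):
            out.append(ALPHABET[(group >> (18 - 6 * k)) & 0x3F])
        out.append("=" * (4 - nchars))
    return "".join(out)


print(encode_base64(b"Man"), encode_base64(b"Ma"), encode_base64(b"M"))
```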

Because it is used only for padding at the end of the data, the occurrence of any "=" characters may be taken as evidence that the end of the data has been reached (without truncation in transit). If the data block ends prematurely without padding, the receiver should go ahead assuming the necessary padding characters anyway. Thus, padding is not strictly necessary.
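
A receiver following the loosened rules (ignore characters outside the alphabet, treat "=" as end of data, tolerate missing padding) could look like the following Python sketch. The function name decode_lenient is hypothetical, and discarding left-over bits at the end of the data is one possible reading of "assuming the necessary padding".

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
INDEX = {ch: value for value, ch in enumerate(ALPHABET)}


def decode_lenient(text: str) -> bytes:
    """Decode base64 text, ignoring foreign characters and missing padding."""
    bits = 0    # accumulator holding bits not yet emitted
    nbits = 0   # number of valid bits in the accumulator
    out = bytearray()
    for ch in text:
        if ch == "=":
            break               # padding marks the end of the data
        value = INDEX.get(ch)
        if value is None:
            continue            # line breaks and other characters are ignored
        bits = (bits << 6) | value
        nbits += 6
        if nbits >= 8:
            nbits -= 8
            out.append((bits >> nbits) & 0xFF)
            bits &= (1 << nbits) - 1
    # Any remaining bits are the zero fill of an incomplete quantum; drop them.
    return bytes(out)


print(decode_lenient("TW\nFu"), decode_lenient("TWE"))
```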

Text encoding

Text encoding is used only when the data for a binary data block should be sent unobscured in the XML message. Text encoded data is not inert to XML specific transformation. Notably, text encoding is subject to the rules of character encoding, white space handling, and end of line rewriting.

Text encoded data is handled by the receiver as follows.

Transform bytes to characters according to the character encoding selected for the enclosing XML entity (e.g., the enclosing HL7 message).
Transform or discard white-space according to XML rules.
Transform end-of-line sequences to LF according to XML rules.
Resolve all XML entity references.
Transform characters to bytes according to the rules of the selected character encoding. This can be the same as the original encoding, but it can also be another encoding selected by other means.
Characters that do not exist in a particular character set or character encoding result in undefined behaviour (e.g. characters might be silently dropped or replaced.)

For this reason, text encoding for binary data should be used only if the distortions entailed by the above-mentioned transformations will not affect the meaning or usefulness of the data.
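
The receiver-side handling of text encoded data can be sketched in Python with the standard xml.etree.ElementTree module, which already performs character decoding and entity resolution. The function name text_encoded_bytes, the default target charset, and the choice to replace unrepresentable characters (rather than drop them) are assumptions made for this illustration; white-space handling beyond end-of-line rewriting is not shown.

```python
import xml.etree.ElementTree as ET


def text_encoded_bytes(xml_source: bytes, charset: str = "utf-8") -> bytes:
    """Return the bytes conveyed by a text encoded element.

    xml_source is one element in the character encoding of the enclosing
    XML entity; charset is the encoding chosen for the resulting bytes.
    """
    # The parser turns bytes into characters and resolves the predefined
    # entity references such as &lt; and &amp;.
    elem = ET.fromstring(xml_source)
    text = elem.text or ""
    # End-of-line sequences become LF (repeated here so the step is explicit).
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Characters missing from the target charset are replaced; the
    # specification leaves this behaviour undefined.
    return text.encode(charset, errors="replace")


print(text_encoded_bytes(b'<FOO ENC="TEXT">a &lt; b</FOO>'))
```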

2.3 Multimedia Enabled Free Text

Multimedia Enabled Free Text (FTX) extends BIN.
Inherited features are shown in a gray background.
Name	XML name	O	C	TY	A	Notes
media descriptor	MEDIA	O		CV	a	mandatory table
compression	COMP	O		CV	a	mandatory table
charset	CHARSET	O		ST	a	mandatory table
encoding	ENC	O		CV	a	mandatory table; default depends on MEDIA
data	DATA	R	C	ST	A

Code for Media Descriptor see IANA for full list.
Code	D	ENC
`text/plain`	D	TEXT
`text/x-hl7-ft`		BASE64
`application/pdf`		BASE64
`text/html`		BASE64
`text/sgml`		BASE64
`text/xml`		BASE64
`text/rtf`		BASE64
`audio/basic`		BASE64
`audio/k32adpcm`		BASE64
`image/png`		BASE64
`image/gif`		BASE64
`image/jpeg`		BASE64
`image/g3fax`		BASE64
`image/tiff`		BASE64
`image/x-DICOM`		BASE64
`video/mpeg`		BASE64
`model/vrml`		BASE64
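
Since the default for ENC depends on MEDIA, a receiver has to resolve the effective encoding from both attributes. A minimal Python sketch of that lookup follows. The function name effective_encoding is hypothetical, and falling back to BASE64 for media types not listed in the table is an assumption, not something this ITS states.

```python
# Defaults taken from the media descriptor table: text/plain is the default
# media type and is the only listed type whose default encoding is TEXT.
DEFAULT_MEDIA = "text/plain"
DEFAULT_ENC_BY_MEDIA = {"text/plain": "TEXT"}


def effective_encoding(attrs: dict) -> str:
    """Resolve the encoding of an FTX value from its MEDIA and ENC components."""
    if "ENC" in attrs:
        return attrs["ENC"]          # an explicit encoding always wins
    media = attrs.get("MEDIA", DEFAULT_MEDIA)
    # Assumption: media types without a listed default use BASE64.
    return DEFAULT_ENC_BY_MEDIA.get(media, "BASE64")


print(effective_encoding({}), effective_encoding({"MEDIA": "image/png"}))
```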

Code for Compression of Free Text Data, default is no compression at all.
Code	D	Meaning
GZIP		gzip (deflate) algorithm

Code for Character Set and Character Encoding see IANA for full list. Default is the character encoding used as the encoding of the enclosing XML document.
Code	D	Meaning
UTF-8		Unicode UTF-8 (backwards compatible to US-ASCII)
US-ASCII		7 bit US ASCII (ANSI X3.4)
UTF-7		Unicode UTF-7 (almost backwards compatible to US-ASCII)
UTF-16		Unicode in 16 bit per character encoding (subject to byte order problems)
ISO-10646-UCS-2		Unicode in 16 bit per character encoding (subject to byte order problems)
ISO-10646-UCS-4		Unicode in 32 bit per character encoding (subject to byte order problems)
ISO-8859-1		ISO Latin-1 (Western European)
ISO-2022-JP		Japanese character encoding
Shift_JIS		Japanese character encoding
EUC-JP		Japanese character encoding

3 Things, Concepts and Qualities

3.1 Code Value

The Code Value is defined as having many components, yet the most important component is the "value" component, which is a character string.

The code system is often specified as mandatory or as a default where an attribute or data type component is declared as a Code Value. The code system version is often implicit as being some recent version and the version is not very important anyway, since versions of code systems are supposed to be largely backwards-compatible.

Finally, the print name is by its definition redundant information.

In other words, a code value can often be represented by a mere character string. While an implementation of HL7 data types should fill in the appropriate components of the code value, these components don't need to be sent automatically, nor do we have to bother implying them by XML specific (and DTD dependent) means, such as #FIXED attributes. Thus, a code value can often be sent as a simple flat XML attribute instead of requiring an XML element with substructures. This not only saves bandwidth (quite dramatically as we shall see) but also adds to the clarity of the message.

If more of the code value's components need to be specified, the code value can be sent as an empty element with only attributes. Expanding the XML attributes to elements is possible, but rarely necessary.

Code Value (CV)
Name	XML name	O	C	TY	A	Notes
value	VAL	R	C	ST	A
code system	SYS	D		ST	a	must be explicitly provided when there is no default code system specified for some element
code system version	VER	O		ST	a
print name	PNM	O	X	ST	a	Content data is assigned to the print name instead of the value if VAL is provided explicitly.
replacement	REPL	X	X	ST	a	Only if value is %null; or %null.other;. Content data is assigned to the replacement instead of the value if VAL is set explicitly to %null; or %null.other;

Examples:

The print name and even the replacement text can appear in the content position if the value VAL is specified as an attribute (and, for the replacement text, explicitly as NULL). Thus, the following expressions are allowed:

<SEX VAL="M">MALE</SEX>
<SEX VAL="%null.other;" SYS="HL7-0001">adrenogenital syndrome</SEX>

This notation is very straightforward and reflects exactly what we would expect when designing a custom document format (compare the use of the mention element in the PRA). Note that this requires a slightly more complex logic at the receiver's end. The receiver would

fold all attributes into elements (this is the general rule);
see if an explicit VAL element exists, if so, use it;
if character content exists
1. if VAL is explicitly set to #OTHER, use content as replacement;
2. if VAL is set to something other than #OTHER use content as PNM.
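
The receiver logic listed above can be sketched in Python as follows. The function name read_code_value, the dictionary representation of a code value, and the set of literals treated as "null, other" are assumptions made for this illustration, since the draft uses both #OTHER and %null.other; for that value.

```python
import xml.etree.ElementTree as ET

# Assumption: literals that mark VAL as "null, other" (the draft uses both).
NULL_OTHER = {"#OTHER", "%null.other;"}


def read_code_value(elem, default_sys=None):
    """Interpret an XML element as a Code Value (CV)."""
    # Fold attributes and child elements into one component map.
    parts = dict(elem.attrib)
    for child in elem:
        parts.setdefault(child.tag, (child.text or "").strip())
    cv = {
        "VAL": parts.get("VAL"),
        "SYS": parts.get("SYS", default_sys),
        "VER": parts.get("VER"),
        "PNM": parts.get("PNM"),
        "REPL": parts.get("REPL"),
    }
    content = (elem.text or "").strip()
    if content:
        if cv["VAL"] is None:
            cv["VAL"] = content      # no explicit VAL: content is the value
        elif cv["VAL"] in NULL_OTHER:
            cv["REPL"] = content     # explicit null/other: content is the replacement
        else:
            cv["PNM"] = content      # explicit VAL: content is the print name
    return cv


print(read_code_value(ET.fromstring('<SEX VAL="M">MALE</SEX>')))
```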

It is up to us to decide whether we want to trade the slightly more complex logic for the possibility of a nicer look and feel of a message. However, even if the nice look and feel is possible, it requires additional logic at the sender's side to actually use it.

3.2 Real World Concepts and the Concept Descriptor

Unlike the ITS independent specification we present the XML representation in the reverse order, i.e. bottom up. Note that the purpose of all the types in this subsection (Code Phrase, Code Translation) is to serve the definition of the Concept Descriptor.

3.2.1 Code Phrase

A code phrase is simply a collection of Code Values. Thus Code Phrase does not need any special XML definition.

However, often all the code values in a code phrase may come from the same code system and sending the code system over and over again for each CV is a waste. Therefore the code phrase can assign a default coding system which is then applied to all code values found in the code phrase. "Default coding system" means that code values without explicitly specified code system will "inherit" this default code system. Individual code values may still override the default code system.

The third row in the table is explained below.

Code Phrase (CDPH) extends SET of CV
Name	XML name	O	C	TY	A	Notes
code system	SYS	O		ST	a	allows setting a default code system for all the code values
code system version	VER	O		ST	a	allows setting a default code system version for all the code values
values	(VAL)	1..*	E	CV*	(a*)	can only appear as an attribute if a default code system is set; the XML tag name will never be used for an element. See text.

The following is the same code phrase in various forms. As can be seen, the amount of tag verbiage can be significantly reduced according to the rules explained in the introduction.

The code values of the code phrase can be contracted into one VAL attribute. In this case, VAL contains white space delimited code value literals (XML NMTOKENS). This reduces the above example even more:

Beware! There is a potential flaw in the NMTOKEN hack: the individual code value may contain white space. There will be no white space in code values in 99% of the cases, however, we can not be totally sure. Should we not allow this nice short form, only because of those unknown 1%?
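
How a receiver might expand a code phrase, including inheritance of the default code system and the contracted VAL form, is sketched below in Python. The function name expand_code_phrase, the element name DIAG, and the code system names in the usage line are hypothetical; the sketch also exhibits the flaw noted above, because splitting on white space breaks any code value that itself contains white space.

```python
import xml.etree.ElementTree as ET


def expand_code_phrase(elem):
    """Return the code values of a code phrase as a list of dictionaries."""
    default_sys = elem.get("SYS")
    default_ver = elem.get("VER")
    values = []
    # Contracted form: white space delimited literals in one VAL attribute.
    for token in (elem.get("VAL") or "").split():
        values.append({"VAL": token, "SYS": default_sys, "VER": default_ver})
    # Expanded form: anonymous values given as A elements.
    for child in elem.findall("A"):
        values.append({
            "VAL": child.get("VAL") or (child.text or "").strip(),
            # An individual code value may override the inherited defaults.
            "SYS": child.get("SYS", default_sys),
            "VER": child.get("VER", default_ver),
        })
    return values


print(expand_code_phrase(ET.fromstring('<DIAG SYS="EXAMPLE-SYS" VAL="A1 B2"/>')))
```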

3.2.2 Code Translation

The code translation extends code phrase, which means it inherits all the features of code phrase and adds four more attributes. The reference to the origin of translation is done through XML ID/IDREF attributes. Note that IDs must be unique within one message but need not be unique across messages (an IDREF cannot point outside of a message).

Code Translation (CDXL) extends CDPH
Name	XML name	O	TY	A	Notes
producer	PROD	O	TII	a
quality	QALY	O	FPN	a
origin	ORG	R	XML:IDREF	a!	must be an attribute
referable identifier	ID	R	XML:ID	a!	must be an attribute

We do not show examples of code translations since code translations never occur outside of a concept descriptor, whose representation is specified next.

3.2.3 Concept Descriptor

Concept Descriptor (CD) extends set of CDXL
Name	XML name	O	C	TY	A	Notes
original text	TEXT	O	C	FTX	a
translations		0..*	E	CDXL*		may only occur as an attribute if only one translation is given and if a default code system is also set

The following examples show the same CD value in various forms, from the most verbose to the most terse form.

If you had only one hair color code, you could send

Note that none of this makes use of any type conversion rule except for converting a string (VAL) to a code value. If you never have original text and you always have only one code, you can send a CV for a CD as follows:

3.3 Technical Instances

3.3.1 ISO Object Identifier

The Object Identifier (OID) is a simple string literal in the canonical number-dot-number form. No white space is permitted and no characters besides "0" (zero), "1" ... "9", and "." (period).
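
The lexical rule for OID literals can be checked with a short regular expression. The following Python sketch is an illustration only; the names OID_PATTERN and is_oid_literal are hypothetical, and the pattern enforces just the rule stated above (digits separated by single periods, nothing else).

```python
import re

# One or more groups of ASCII digits separated by single periods.
OID_PATTERN = re.compile(r"[0-9]+(\.[0-9]+)*")


def is_oid_literal(text: str) -> bool:
    """Return True if text is in the canonical number-dot-number form."""
    return OID_PATTERN.fullmatch(text) is not None


print(is_oid_literal("2.16.840.1.113883"), is_oid_literal("1..2"))
```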

Examples

3.3.2 Technical Instance Identifier

Technical Instance Identifier (TII)
Name	XML name	O	C	TY	A	Notes
root	ROOT	R	C	OID	A
extension	EXT	O		ST	a

Examples:

As an alternative we could define a string literal for TIIs which would thus fit into only one XML attribute:

Note that this creates a problem when the extension (EXT) itself contains an at sign ("@"). But it is nice and handy, so maybe it is worth the little additional trouble?

Many TIIs will occur as elements of sets of TIIs. This will look as follows:

3.3.3 Technical Instance Locator

The ITS independent definition of the Technical Instance Locator needs revision to (1) specify the table of protocol codes (to be IETF-URL + extensions), and (2) be more specific about the phone number and what happened to TN (and why it happened).

Technical Instance Locator (TIL)
Name	XML name	O	C	TY	A	Notes
protocol	PROTO	R		CV	a
address	ADDR	R	C	ST	a

Examples:

3.4 Real World Instances

All names and identifiers for real world instances are partly RIM classes and partly data types. This ITS specifies only those parts that are data types. The RIM classes are handled by the generic MET/MEI mechanism specified elsewhere.

The Real World Instance Identifier is only a RIM class and thus will not be handled here.

3.4.1 Person Name

The person name is simply a list of person name parts, so no special XML specification is necessary for the Person Name (PN).

Every person name part has a character string value and a set of coded classifiers. Again, we will rely on character string to code value conversion with the mandatory default classifier code. Just as we did with the code phrase, we will make use of XML's "NMTOKENS" attribute type that allows us to enumerate all classifiers in one XML attribute.

Person Name Part (PNXP) extends SET of CV
Name	XML name	O	C	TY	A	Notes
value	VAL	R	C	ST	a
classifiers	(C)	O	E	CV*	a*	mandatory table. The XML tag name (C) will never be used in an element.

Examples

3.4.2 Postal and Residential Address

Address Part (ADXP)
Name	XML name	O	C	TY	A	Notes
value	VAL	R	C	ST	a
role	R	O	E	CV	a	mandatory table.

Address (AD) extends LIST of ADXP
Name	XML name	O	C	TY	A	Notes
value	VAL	R	C	ST	a
purpose	PUR	O		CV	a	mandatory table.
bad address flag	BAD	O		BL	a	default is %null; or %null.unknown;

The first example shows an expanded regular form (not even fully expanded). The second example is much more concise: it folds elements into XML attributes and character data content, and relies on type casting between ST and CV.

<Address TY="AD">
  <PUR TY="CV" VAL="RES" PNM="residence"/>
  <BAD TY="BL">&false;</BAD>
  <A TY="ADXP">1028<R TY="CV" VAL="HNR"/></A>
  <A TY="ADXP">Pinewood Ct<R TY="CV" VAL="STR"/></A>
  <A TY="ADXP"><R TY="CV" VAL="DEL"/></A>
  <A TY="ADXP">Indianapolis<R TY="CV" VAL="CTY"/></A>
  <A TY="ADXP">, <R TY="CV" VAL="DEL"/></A>
  <A TY="ADXP">IN<R TY="CV" VAL="STA"/></A>
  <A TY="ADXP">-<R TY="CV" VAL="DEL"/></A>
  <A TY="ADXP">46240<R TY="CV" VAL="ZIP"/></A>
</Address>

<Address PUR="RES" BAD="&false;">
  <A R="HNR">1028</A>
  <A R="STR">Pinewood Ct</A>
  <A R="DEL"/>
  <A R="CTY">Indianapolis</A>
  <A R="DEL">,</A>
  <A R="STA">IN</A>
  <A R="DEL">-</A>
  <A R="ZIP">46240</A>
</Address>
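To illustrate how little work the concise form leaves for a receiver, here is a rough sketch that reads it with Python's standard `xml.etree.ElementTree`. It is not a conforming parser. The function name `read_address` is hypothetical, and the BAD attribute is left out of the sample because its `&false;` entity would need a DTD declaration that this sketch does not supply.

```python
import xml.etree.ElementTree as ET

# Concise form from the text, minus the BAD attribute (see note above).
CONCISE_ADDRESS = """\
<Address PUR="RES">
  <A R="HNR">1028</A>
  <A R="STR">Pinewood Ct</A>
  <A R="DEL"/>
  <A R="CTY">Indianapolis</A>
  <A R="DEL">,</A>
  <A R="STA">IN</A>
  <A R="DEL">-</A>
  <A R="ZIP">46240</A>
</Address>
"""


def read_address(xml_text):
    """Return (purpose, [(role, value), ...]) for a concise-form Address."""
    root = ET.fromstring(xml_text)
    # Each A element is one address part: role in R, value as content.
    parts = [(a.get("R"), a.text or "") for a in root.findall("A")]
    return root.get("PUR"), parts


purpose, parts = read_address(CONCISE_ADDRESS)
```

The expanded form would need one more step per part, namely looking into the nested R element for the role code instead of reading an attribute.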

3.4.4 Organization Name

An Organization Name (ON) is simply a set of Organization Name Variants (ONXV). No special XML specification is needed.

Organization Name Variant (ONXV)
Name	XML name	O	C	TY	A	Notes
value	VAL	R	C	ST	a
type	TYPE	O	E	CV	a	mandatory table.

The example, again, shows the full-blown form first, followed by the maximally reduced form:

<OrgName TY="ON">
  <A TY="ONXV">
    <VAL TY="ST">Franklin Templeton Growth Fund, Inc.</VAL>
    <TYPE TY="CV" VAL="L" PNM="legal"/>
  </A>
  <A TY="ONXV">
    <VAL TY="ST">Templeton Growth Fund</VAL>
    <TYPE TY="CV" VAL="A" PNM="alias"/>
  </A>
  <A TY="ONXV">
    <VAL TY="ST">TGF</VAL>
    <TYPE TY="CV" VAL="D" PNM="display"/>
  </A>
  <A TY="ONXV">
    <VAL TY="ST">TEPLX</VAL>
    <TYPE TY="CV" VAL="ST" PNM="stock exchange symbol"/>
  </A>
</OrgName>

<OrgName TY="ON">
  <ONXV TYPE="L">Franklin Templeton Growth Fund, Inc.</ONXV>
  <ONXV TYPE="A">Templeton Growth Fund</ONXV>
  <ONXV TYPE="D">TGF</ONXV>
  <ONXV TYPE="ST">TEPLX</ONXV>
</OrgName>

4 Quantities

4.1 Integer

Integer numbers (INT) are represented as decimal digit character strings.

Example:

The positive integer infinity (Aleph₀) is representable as &int.inf;, the negative integer infinity is represented as &int.ninf;.

4.2 Floating Point Number

Floating point numbers (FPN) are represented as decimal digit character strings, optionally with "E exponent" suffix, according to the ITS independent specification.

The precision is determined according to the rules of significant digits.

Example:

The positive real infinity (Aleph₁) is representable as &fpn.inf;, the negative real infinity is represented as &fpn.ninf;.

4.3 Physical Quantity

Physical Quantity (PQ)
Name	XML name	O	C	TY	A	Notes
value	VAL	R	X	FPN	a	may appear as data content only if the UNIT attribute is given.
unit	UNIT	O		CD	a	Default code system is the Unified Code for Units of Measure (UCUM) in its case-insensitive form. Default is "1" (the unity).

If the default code system is used, the CD can be replaced by a CV or even a character string. Note that in subsequent revisions CD may turn into CV and UCUM be required as mandatory code system. Here we will always assume that UCUM is used.

In addition, a character string literal form is defined for the entire PQ data type. A PQ character string consists of an FPN character string, followed by whitespace, followed by a unit expression according to the UCUM specification. For example, "8 HR" means eight hours, "35 S" means 35 seconds, "2.5 DYN.S/CM5" is 2.5 dyne seconds per centimeter to the fifth power, and "15 /MIN" means fifteen per minute. Some whitespace must be present between number and unit.
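The literal form just described can be taken apart with a small routine. The sketch below is only an illustration: the name `parse_pq` and the exact number syntax accepted are assumptions, and the unit expression is returned as written without being validated against UCUM. The number is kept as a `Decimal` so that its written digits, and hence its precision in the sense of significant digits, are not lost.

```python
import re
from decimal import Decimal
from typing import Tuple

# Number, mandatory whitespace, then a unit expression without blanks.
# Assumption: the FPN part is plain decimal digits with an optional
# fraction and an optional "E exponent" suffix.
PQ_PATTERN = re.compile(
    r"([+-]?[0-9]+(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?)\s+(\S+)"
)


def parse_pq(literal: str) -> Tuple[Decimal, str]:
    """Split a PQ string literal into (value, unit expression)."""
    match = PQ_PATTERN.fullmatch(literal.strip())
    if match is None:
        raise ValueError(f"not a PQ literal: {literal!r}")
    return Decimal(match.group(1)), match.group(2)
```

A literal such as "8HR" is rejected by this sketch because the whitespace between number and unit is missing.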

Examples, as usual in the order of increasing conciseness.

4.4 Monetary Amount

Monetary Amount (MO) has a representation similar to that of Physical Quantity.

Monetary Amount (MO)
Name	XML name	O	C	TY	A	Notes
value	VAL	R	X	FPN	a	may appear as data content only if the UNIT attribute is given.
currency unit	UNIT	R		CD	a	Default code system is ISO 4217. There is no default unit value, i.e. the currency unit must be specified.

If the default code system is used, the CD can be replaced by a CV or even a character string. Note that in subsequent revisions CD may turn into CV and ISO 4217 be required as mandatory code system. Here we will always assume that ISO 4217 is used.

In addition, a character string literal form is defined for the entire MO data type. An MO character string consists of an FPN character string, followed by whitespace, followed by a currency unit symbol according to ISO 4217. For example, "50 USD" means fifty U.S. dollars, "85 DEM" means 85 Deutsche Mark, and "240 FRF" is 240 French Francs. Presumably "55 EUR" will mean fifty-five Euro. Some whitespace must be present between number and unit.

Examples, as usual in the order of increasing conciseness.

4.5 Point in Time

The data type for Point in Time (TS) is represented as a character string according to the HL7 v2.3 standard. Although the ITS independent specification discusses issues of adopting ISO 8601, we will stick to the old HL7 form since there was nothing wrong with it.

The example shows my birth date and time, precise to the hour (I don't know the minute) in the Middle European Time zone (UTC+1)
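Separately from that example, the sketch below shows how such a string could be split into its fields. It rests on assumptions that should be checked against the HL7 v2.3 text: it takes the form to be YYYY[MM[DD[HH[MM[SS[.S[S[S[S]]]]]]]]][+/-ZZZZ], where each finer field is optional, which is what allows a value precise only to the hour. The function name `parse_ts` and the sample value used in the note after the code are hypothetical.

```python
import re
from typing import Dict

# Each finer field is nested inside the previous one, so a field can only
# be present if all coarser fields are present too (assumed grammar).
TS_PATTERN = re.compile(
    r"(?P<year>[0-9]{4})"
    r"(?:(?P<month>[0-9]{2})"
    r"(?:(?P<day>[0-9]{2})"
    r"(?:(?P<hour>[0-9]{2})"
    r"(?:(?P<minute>[0-9]{2})"
    r"(?:(?P<second>[0-9]{2}(?:\.[0-9]{1,4})?)"
    r")?)?)?)?)?"
    r"(?P<zone>[+-][0-9]{4})?"
)


def parse_ts(literal: str) -> Dict[str, str]:
    """Return the fields present in a TS literal, keyed by field name."""
    match = TS_PATTERN.fullmatch(literal)
    if match is None:
        raise ValueError(f"not a TS literal: {literal!r}")
    # Fields that were not written are dropped, which keeps the
    # precision of the original value visible to the caller.
    return {k: v for k, v in match.groupdict().items() if v is not None}
```

For instance, `parse_ts("1999010112+0100")` yields year, month, day, hour and zone, but no minute, so the precision to the hour is preserved.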

4.6 Calendar Modulus

To be specified according to the ITS independent specification, which is not ready yet.

5 Orthogonal Issues

5.1 Interval

Interval (INV) of type T
Name	XML name	O	C	TY	A	Notes
low	LOW	O		T	a
low closed	LCL	O		Boolean	aF	may be folded into the LOW element as an attribute OPEN or CLOSE
high	HIGH	O		T	a
high closed	HCL	O	I	Boolean	aF	may be folded into the HIGH element as an attribute OPEN or CLOSE

Example:

5.2 General Annotation

Annotation is one of those generic data types that merely add some additional data to any other data type. We will save a level of nesting by implementing this in a way that only adds an element to any other data type.

Annotation (ANT) extends ANY
Name	XML name	O	C	TY	A	Notes
note	NOTE	R		FTX	a

Example:

5.3 History Item and History

History (HIST) is simply a SET of History Item. Note that the ITS independent spec. should be changed to relax LIST to SET.

History Item (HXIT) is a possible extension of any other type by adding an element that captures the validity interval.

History Item (HXIT)
Name	XML name	O	C	TY	A	Notes
validity period	ASOF	R		IVL	a

Example:

5.4 Uncertainty

To be defined.