This is Info file ProtoGen.info, produced by Makeinfo-1.64 from the
input file ProtoGen.texi.

   This text describes the implementation of HL7 that is being done at
the Universitätsklinikum Steglitz in Berlin.  It is meant as a report
about the work in general as well as a manual for the software that is
about to be developed.

   Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.

   Permission is granted to copy and distribute modified versions of
this manual under the conditions for verbatim copying, provided that
the entire resulting derived work is distributed under the terms of a
permission notice identical to this one.

   Permission is granted to copy and distribute translations of this
manual into another language, under the above conditions for modified
versions, except that this permission notice may be stated in a
translation approved by the Free Software Foundation.

   Copyright (C) 1994, 1995, 1996 Gunther Schadow


File: ProtoGen.info, Node: Consistency check, Next: The data item numbers, Prev: Errors, Up: The HL7 database

Consistency check
=================

   Equipped with the insight of the last section, we can now start to
check our database.  We will have to check for

   * tuples with undefined keys

   * whether one relation is included in another

   * consistency of conceptually redundant data

   * missing data

   * syntax errors in message definitions

   During the rest of this section we will mostly be looking at the
output that was generated by the Prolog check clauses:

     Removing Facts with undefined key value:
     retracting segment(G198, "")

   There was a completely void segment definition, caused while
chapter 1 was processed.  Something went wrong there, but we can safely
ignore it.

   Now we check for tables which had not been defined in appendix A
but were used in the chapters.  `chptbl.pl' was loaded and compared
against table/3; the result is `none', meaning that no undefined
tables have been found.  Next, we check for a one-to-one relationship
between table/2 and table/3, in particular whether table/3 contains
all of table/2, which is true.  The descriptions of corresponding
tables are consistent, which means that we can safely exclude table/2
from further processing without loss of information.

     Performing checks on:

     *** Tables
     chptbl compiled, 0.05 sec, 2,120 bytes.
     tables which are not yet defined: none
     tables from table/2 that are not in table/3: none
     tables with conflicting descriptions: none

   The next part checks whether there are tables referred to for which
no values are defined.  This is true for a lot of tables, but it is not
necessarily wrong.  A table denoted as `user_defined' may well be
undefined in the HL7 standard.  But how about tables marked as
`hl7_standard'?  These may be references to tables defined by other
standards; we will remember this problem in case we get into trouble,
but leave it alone for now.  We also check whether these missing
tables are defined somewhere in the chapters, which they are not in
either case.  Finally we check for values which do not belong to any
table, which would point to an error in our AWK scripts, but this is
not the case.

     tables without any value definition:
     for class `hl7_and_user': 111,112,101,30,90,63.
        in the chapters: none
     for class `hl7_standard': 50,31,13,25,59,51,52,55,56,37,14,26,18,
        75,17,29,35,41,15.
        in the chapters: none
     for class `user_defined': 117,23,19,120,21,22,32,43,44,45,46,47,
        49,114,113,57,66,60,24,64,96,68,69,42,
        72,86,73,79,118,81,83,84,10,87,88,89,92,
        93,94,115,110,98,99.
        in the chapters: none
     values without any defining table: none

   We process segment in a similar way as table above, i.e. we check
for containment and conflicting descriptions, and find no surprise.

     *** Segments
     segments from segment/2 that are not in segment/3: none
     segments with conflicting descriptions: none

   Here follows the proof of the assumptions we made about the
one-to-one relationship of field and data element.  The extra check on
conflicts in length exists for historical reasons and will disappear,
since it is done again by the general check for conflicts.

     *** Data elements
     conflicts in length:
        for data_element/9: none
        for field/10: none
     data_element/9 not in field/10: none
     data_element/9 referenced by more than one field/10: none
     field/10 and data_element/9 conflicts: [obx,4,769],[obx,6,562].

   We see that our proof of the one-to-one relationship succeeds,
although there are inconsistencies concerning the data types of two
pairs of field and data element, which are marked below:

     field(obx,4,20,*st*,_,_,_,00769,"observation subid").
     data_element(00769,"observation subid",obx,anr,20,*nm*,_,_,_).

     field(obx,6,20,*st*,_,_,_,00562,"units").
     data_element(00562,"units",obx,anr,20,*id*,_,_,_).

   The next check is as self-explanatory as its outcome is clean.

     *** Fields
     field/10 without segment/3: none
     segment/3 without field/10: none

   Finally there are the checks on messages.  The message definition
syntax is error free because Prolog has already filtered out the
mismatched parentheses.  Note that the problem was not corrected by
Prolog, it was just removed.  The absence of syntax errors merely means
that the conversion scripts have done their job well.

   But then there are undefined message types.  We do not care about
the stuff from the `Widget' non-functional area.  That the messages of
appendix C do not appear in the list of message types in appendix A is
a fact that makes us worry once more about the `computerized data
dictionary' which is said to be used to generate all the tables.  Why
are there inconsistencies?  Why are there missing objects?  In fact
some people think that the computerized data dictionary is just a
phantom.  The ADR message type is unknown as well, but these problems
could be fixed easily once we know about them.  Undefined messages,
however, are not at all easy to fix: how should we know about them if
we are told nothing but their name and functional area?  Note that ORF
and ORM are missing because of the syntax error above.  While OCF has
disappeared since v2.2, ARD and OSQ are still hanging around in the
tables.  We assume that they will be fired next time as well.

   Finally -- again forget about the widgets -- there are messages
which refer to segments which are not defined.  This is the worst
error that we have detected here, because it makes whole messages
unimplementable.  However, if we subtract the `ms' typo in an ACK, of
which there are many other definitions as well, and once we know that
we have to treat `ANY' segments specially, there remains PD1, patient
demographics, which is undefined.  After all, v2.2 has silently fired
this segment just as v2.1 silently contained it.

     *** Messages
     message definition syntax errors: none
     undefined message types used: adr,nmd,nmq,nmr,wro,wrp.
     undefined message: ard,ocf,orf,orm,osq.
     undefined segments used: [wro,wid],[wrp,wdn],[wrp,wpn],[wrp,wpd],
        [wrp,wdn],[wrp,wpn],[wrp,wpd],[wrp,wdn],[wrp,wpn],[wrp,wpd],
        [ack,ms],[adt,pd1],[orr,any].


File: ProtoGen.info, Node: The data item numbers, Next: On the abstractness of abstract syntax in HL7, Prev: Consistency check, Up: The HL7 database

The data item numbers
=====================

   The relations for fields, tables and values each have one domain
that is made up of a kind of integer number, i.e. strings of a few
digits, often beginning with zeroes.  These are not really numbers,
because leading zeroes are obviously significant.  Thus a five digit
string denotes a field, a four digit string denotes a table, and a six
digit string denotes a value.

   If we want to regard these digit strings as numbers we have to
admit that they are not a unique classification of data items, but no
more than a key to the relation where there would otherwise be no
simple key.  In fact the only use of these keys is for the relation of
tables, while fields and values have a composite key which is
sufficient.  In the latter cases we will not make use of the digit
strings.

   Nevertheless, there is evidence that what was intended with the
data item numbers is a kind of classification of the data items of
HL7.  How else could the meaning that was given to the leading zeroes
be explained?  It would have been sufficient to just assign numbers
which do not uniquely identify an HL7 data item but which do their job
within a single relation.  Anyway, we do not need an HL7 data
classification here.


File: ProtoGen.info, Node: On the abstractness of abstract syntax in HL7, Next: On trigger events, Prev: The data item numbers, Up: The HL7 database

On the abstractness of abstract syntax in HL7
=============================================

   The HL7 documentation claims that it complies with the idea of the
OSI reference model with its seven distinct layers.  Terms like
`abstract syntax' vs. `encoding rules' are frequently used in the
specification.  However, as we already stated above (*note A view on
HL7::.), these distinctions are not always made as strictly as the
document claims.  Let us see why this is so.

   There are the HL7 encoding rules, which are meant as an interim
standard made available until there are implementations of OSI
standards.  These encoding rules are very simple: any data is
represented as a string of displayable ASCII characters, with a set of
five delimiters defined which terminate the data items.  Since it is
thus very unlikely that any byte of data will interfere with the
underlying transport mechanism, it is possible even for the simplest
kind of text processor, batch file or serial line to transmit HL7
messages.

   However, what seems like an advantage at first sight turns out to
have a considerable impact on the higher levels of abstraction.  These
encoding rules impose the restriction on the higher levels of the
protocol stack that they may not send unprintable characters and may
not even use the delimiter characters as data.  A presentation layer
that forwards its task up to the higher layers is of pretty little
use.  It should rather make its mechanisms and those of the underlying
layers transparent to the upper layers.

   One could argue here that the HL7 encoding rules were not meant to
be perfect, and that better encoding standards which are now available
would replace them.  The abstract message definition, as the heart of
HL7, would allow this.
   However, there are parts of the encoding rules which have taken a
place even in the abstract message definition: the MSH segment defines
the `encoding characters' as the first data field of the MSH.  This is
wrong.  It would have been easy to let the negotiation about encoding
characters be part of the LLP.

   These problems notwithstanding, there is a way to overcome at least
part of the problems that the encoding rules impose on the higher
levels of HL7: since any data is converted to a string of printable
ASCII characters, the problem is of little practical relevance for
data types which represent numerics (i.e. NM, DT, TM, TS, SI).
However, text data types (i.e. ST, TX, FT) are set directly by the
application, which should not be forced to worry about the
printability of ASCII characters.  The encoding module must provide at
least some transparency here, but HL7 defines no standard for this,
even though the solution is so obviously at hand: there is already one
encoding character defined which is called the `escape character'.
The usage of escape characters is common among applications as well as
communication programs.  Escape characters are commonly used for two
purposes:

  1. to protect data bytes from being misinterpreted as control bytes
     (e.g. `\' as used in shell programs, or the DEL character used in
     ANSI X3.28 transparent mode), and

  2. to mark some sequence of bytes in a stream of data bytes to be
     interpreted as entities of control, often referred to as an
     `escape sequence' (e.g. ANSI terminal, TeX, SGML).

   The HL7 escape character is used to mark a sequence of control
characters (purpose 2 above).  Unfortunately, the usage of escape
sequences is currently limited to the TX and FT types.  Escape
sequences should, however, be applicable to any type where
data/control ambiguities might arise.  Since the delimiter characters
are readily available for redefinition, this ambiguity might arise in
the encoding of any data type.  Consider some message that redefines
the delimiters to be `+.:-?' instead of `|~^&\': even the numerical
and date/time data types must then use escape sequences to
unambiguously encode their values.

   But there is a still bigger problem: some features of HL7 severely
corrupt the distinction between the abstract syntax (application
layer) and the presentation layer.  All of these issues are concerned
with the lengths of fields or blocks.  HL7 drags views which only
exist on character streams far into the abstract level, where we
should rather deal with concepts than with strings.

   The first issue is the definition of maximal lengths for fields
which are not of a string or text data type.  These make unwarranted
assumptions about the representation of values.  For example, the
length of a DT value does not belong in the description of a PID(1)
segment, since this makes assumptions about how DT values are
represented, which highly depends on the encoding rules used.

   Giving a maximum length is not correct for the PN type either.  PN
is a composite type which consists of six ST components; its maximum
length is defined to be 48, including the delimiter characters.  Not
only should delimiters not be part of an abstract syntax -- how could
this restriction be applied at all?  Two passes are needed for a
correct encoding: the first pass has to assemble the PN encoding from
the encodings of its components, the second has to check whether the
whole string that encodes the PN value exceeds the length limit.  If
the length is more than 48, a crucial question arises: which of the
components is to be truncated?
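   To make the problem concrete, here is a minimal sketch of the two
passes; the function encodePN() and its argument layout are merely
assumed for illustration and are not code from the library:

     // Sketch of the two passes that the 48 character limit on PN
     // would force upon an encoder.
     #include <cstddef>
     #include <string>
     #include <vector>

     std::string encodePN(const std::vector<std::string>& components)
     {
         // pass 1: assemble the PN encoding from the encodings of its
         // six ST components, separated by the component delimiter `^'
         std::string encoded;
         for (std::size_t i = 0; i < components.size(); ++i) {
             if (i > 0) encoded += '^';
             encoded += components[i];
         }

         // pass 2: check the complete encoding against the limit
         if (encoded.size() > 48) {
             // which component should give way?  The standard does not
             // say; blind truncation is all that is left to us here.
             encoded.resize(48);
         }
         return encoded;
     }

   The second pass can only operate on the already flattened string,
i.e. on a view that belongs to the encoding rules rather than to the
abstract syntax.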
   It is obvious that such a restriction cannot be implemented with
reasonable effort, and since it is of no use at all it seems to exist
merely for historical reasons.  This may shed a light on the concept
of data in the earlier days of HL7: any data was obviously regarded as
a string, even numerics or composites.

   While we could silently ignore the 48 character restriction of PN,
there are more assumptions being made about lengths which seem
inadequate to the author.  There is the method of continuation
segments proposed in the HL7 standard.  This is a feature which again
loads a burden onto the application that the lower layer protocol
should carry.  An application would have to bother with the reassembly
of continued messages, which is extremely cumbersome.  Segments are
entities of data transmission and as such their integrity should not
be touched on the application layer.  There is hardly any need for the
continuation of a segment if there is a proper lower layer protocol.
Lengthy messages should be split into packets and reassembled, all of
which should happen completely transparently to the application layer.

   ---------- Footnotes ----------

   (1)  there is an inconsistency in v2.2 which says that PID field 7
is a `TS' value of length 8


File: ProtoGen.info, Node: On trigger events, Next: On the null value, Prev: On the abstractness of abstract syntax in HL7, Up: The HL7 database

On trigger events
=================

   The trigger events build a link from the HL7 transactions to the
real world.  They describe the circumstances in which a certain
transaction is initiated.  There is a many-to-one relation between
trigger events and message types.  Especially the ADT message is a
superset of many messages which, though syntactically similar, are
distinct in their contents and purpose.

   There is a classification of trigger events in the table of trigger
event codes.  However, there are more trigger events than are listed
in table 0003.  From the view of the HL7 database, the concept of
event type codes is unfortunately degraded to a mere table of
subselectors, which are defined only for those events for which there
is a many-to-one relation to message types.  It is desirable to have a
complete list of event codes, so that a message could be uniquely
referred to by its event code.  Until then we either have to refer to
a message by specifying the message type and the event code (if there
is one), or we have to merge the tables of message types and event
codes into one, such that any message type that has several event
codes is removed from the merged table.


File: ProtoGen.info, Node: On the null value, Prev: On trigger events, Up: The HL7 database

On the null value
=================

   HL7 makes a distinction between values that are not present and
those that are null.  While the meaning of `not present' is obvious
(it could be paraphrased as `unknown'), the semantics of the null
value remain somewhat confusing, even though the null value is
repeatedly mentioned in the HL7 document.  The problem is to arrive at
a consistent general meaning of the null value, and the crucial
question is: what does a null value mean in an NM field?  The HL7
document (v2.2) says the following (page 2-5):

     `The difference [between not present and null] appears when the
     contents of a message will be used to update a record in a
     database rather than create a new one.  If no value is sent,
     (i.e., it is omitted) the old value should remain unchanged.  If
     the null value is sent, the old value should be changed to null.'
   But what does this notion of `null' mean with respect to NM or DT
fields?  The problem fades away if we regard any data transmitted by
HL7 as strings.  Here, however, we want to provide a mapping from
diverse data types to HL7.  Should we simply map `""' to `0' then?
Even though this would solve the problem for NM types, it would cause
nonsense values in other fields like TS.  There are two ways out of
this paradox:

   * restrict the null value to string-like data only

   * open up the meaning of the null value so that it can apply to any
     data type

   Here we try to go the second way and regard an object which is not
present as unknown, whereas null indicates that an object is known to
be non-existent or that it does not make sense in a certain context.
This is still consistent with the interpretation of the null value
with respect to databases that is given in the HL7 standard: an
unknown value will not cause the database to be changed, but will
rather be bound to what is stored in the database.

   It remains subject to further discussion how the difference between
a string of length zero and a null value would have to be represented
and interpreted.  For example, a zero-length string could be encoded
as `\0' (or as `""', while the null value would be `\0').  The string
of length zero updates a database field to a string of length zero,
while a null value updates the database to a value meaning `value does
not exist'.  Thus the HL7 protocol would be able to handle nulls as
they are proposed for the nested universal relation database model
(see LEVENE (1992)) to handle incomplete information.  Since
incomplete information does concern medical informatics, it is
strongly recommended here to go this way of opening up the null value
for a general, well defined usage in any data type.

   However, a problem still remains, as the scope of the null value is
unclear in a composite data type: if a CM field is received as `""',
what does it mean?  Does it mean that the first component of the field
is null and the other components are not present?  Or does it mean
that the whole CM data is null?  This ambiguity could be resolved by
forbidding the deletion of delimiters which terminate trailing items
that are not present.  Thus `1234^^' must not be truncated to `1234'.
However, this error tolerance is a feature which the extensions of
v2.2 rely on, so it will be hard to convince the HL7 committee to
change it.  But other ways can be found (and should be found) to
overcome this ambiguity.


File: ProtoGen.info, Node: Generating C++ code, Next: Integration into the system, Prev: The HL7 database, Up: Top

Generating C++ code
*******************

   This chapter is dedicated to the actual implementation of the HL7
standard, whose specification was made accessible to machine
processing by the process shown in the preceding chapters.  I have
already pointed out some issues which will become relevant in the
implementation.  This chapter begins by presenting the general concept
that was assumed for the implementation.  We will then shed a light on
the compiler that does the job of translating the database into
program code.

* Menu:

* General concept::
* I/O methods::
* The code generator::


File: ProtoGen.info, Node: General concept, Next: I/O methods, Prev: Generating C++ code, Up: Generating C++ code

General concept
===============

   The object oriented programming technique as provided by C++ allows
a very natural view of the HL7 data items.  Any data item is an object
of HL7 (an HL7 object).
   Individual data types, segments or messages are special objects
which share common properties.  We define the properties that are
common to all HL7 objects as the class "HL7Object".  The class
"HL7Object" is an abstract base class which nevertheless contains some
method and data members, including the flag that shows whether an
object is present or not.

   There is a hierarchy of objects as shown in figure 4.  At the root
there is the HL7Object, which is inherited by every other data object.
Then come the basic HL7 types and the set of delimiters, which could
be regarded as a special data type as well.  These basic types have
been implemented manually, since they are quite heterogeneous but few
in number.  It does not seem appropriate to generate them
automatically from some kind of database.  Moreover, the description
of the data types in the HL7 standard is presented in a narrative form
which is hard to scan for specifying data.

   The composite objects, however, are generated from a simple
relation that describes their contents.  Normally a component is a
basic data type (with some exceptions where a component is itself a
composite, see below (*note I/O methods::.)), which reduces the
organization of a composite data type to merely collecting the basic
data types.  The relation that describes the composites is manually
edited.

   Segments are described in the HL7 standard by tables, which we have
brought into the form of a database.  The implementations of segments
are generated from this database.  So are the implementations of
messages.

   There is a need to introduce an abstraction of segments: the ANY
segment.  This is a segment that can be of any type.  Since segments
are encoded as tagged data, i.e. data that is preceded by some
identifying code, we can exactly discriminate the types of segments by
the tag (the segment id).  This is not possible with data types, since
there are ambiguities and no tag.  What we just said about the ANY
segment applies to the ANY message as well.  However, discriminating
messages is more complex than discriminating segments, since the
information about message types is scattered throughout several
segments.

   There are several qualities which are common to all HL7 objects; we
present these in the following subsections.

* Menu:

* Names::
* States::
* Class and type::
* Repetition and optionality::
* Register::
* Methods::


File: ProtoGen.info, Node: Names, Next: States, Prev: General concept, Up: General concept

Names
-----

   In program generators there is the question of how to choose names
for the various objects.  The more complex the data structures, the
more distinct names have to be given to objects.  There is always a
tradeoff between highly descriptive names, which are long and thus
hard to type, and short names which are easier to type but less
descriptive.  Even though program generators do well without
descriptive names, the end users of the libraries which are to be
built are human beings, programmers, for whom the naming should be
convenient.  Because C++ allows overloading of functions, names can be
shorter without the risk of name conflicts.

   Here we make the following naming conventions.  Classes for data
types and segments are given the name of their two or three character
id in upper case, with the lowercase qualifier `typ' or `seg' directly
attached to it.  Thus `NMtyp' is the class of numerics and `MSHseg' is
the class of message header segments.
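   As a minimal sketch (the member lists are placeholders, not the
real interfaces), the hierarchy and naming convention could be
pictured like this:

     // Sketch only: the real classes carry the state and methods
     // described in the following subsections.
     class HL7Object          // abstract base class of all HL7 objects
     {
         // common state, e.g. the flag showing whether the object is
         // present, and the common status methods
     };

     class NMtyp  : public HL7Object { /* numeric data type NM       */ };
     class STtyp  : public HL7Object { /* string data type ST        */ };
     class MSHseg : public HL7Object { /* message header segment MSH */ };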
   Message classes are named in a similar way by attaching `msg' to
the uppercase letters which are taken either from the list of message
type ids or from the list of event type codes(1).  Note that we try to
avoid the underscore character `_', which tends to merely lengthen a
name; readability can be achieved by alternating uppercase and
lowercase letters as well.

   Instantiations of classes, i.e. variables, as well as symbolic
names of table values, are given a name which is derived from the
object's description.  There is a simple function which produces valid
C names from arbitrary text strings.  The function is given three
arguments: the text string, the "threshold length" and the "truncate
length".  All words in the string are concatenated, with the first
letter of each in upper case and all other letters in lower case.
Words longer than the threshold length are truncated to the truncate
length.  Typically the threshold length is greater than the truncate
length, which allows short words to be kept complete while longer
words are cut down to a short length.  For example, if the threshold
is 5 and the truncate length is 3, a description like "Patient Visit -
Additional Info." becomes `PatVisitAddInfo'.

   This produces names which are sufficiently unique in most cases.
However, there are ambiguities among table values which require a
refinement of the name assembling algorithm.  These ambiguities occur
when there are single words which begin with a common prefix like the
Latin preposition "intra" or the Greek quantifier "milli".  Thus
"intravenous" and "intradermal" both become `Int'.  This can be
overcome by splitting such composite words into two ("intra venous"
giving `IntraVen').

   ---------- Footnotes ----------

   (1)  actually we merge these two lists into one


File: ProtoGen.info, Node: States, Next: Class and type, Prev: Names, Up: General concept

States
------

   Any object according to HL7 must have (at least) the following
states:

  1. present or not present

  2. null or something other than null

   The status qualities of an object can be inspected through the
member functions named `is()'.  However, modifying a status flag does
not always make sense.  Thus the `present' quality can only be set,
and the `null' quality can only be cleared, by a class which is
permitted to modify the components of the object (i.e. the object
itself, including derived objects).  On the other hand, unset() and
nullify(), which make an object not present or null respectively, may
be called from any context which has an instance of an HL7 object.

   There are two other status flags which are not defined in HL7 but
are useful as a means to test the integrity of an object.  These
qualities are

  3. broken

  4. zombie

   A `broken' object is one that was not initialized correctly.
Normally a broken condition should end up in an exception, because it
is always the result of a programming fault; however, exception
handling is currently not supported by some C++ compilers.  Until
then, it can make sense to use the broken bit.  The `zombie' status
means that an object has already been destroyed: either because it was
explicitly deleted or because it went out of scope.  Such an object is
likely to contain damaged values.  If an object is accessed via a
pointer and the zombie flag happens to be set, there is a programming
fault.  These events should result in an error exit (with a core
dump).
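   A minimal sketch of the status interface might look as follows; the
bit layout and the protected helpers are assumptions, only is(),
unset() and nullify() are taken from the description above:

     // Sketch of the status part of HL7Object.
     class HL7Object
     {
     public:
         enum Status { present = 1, null = 2, broken = 4, zombie = 8 };

         HL7Object() : status(0) {}            // starts out not present

         int  is(Status s) const { return (status & s) != 0; }
         void unset()            { status &= ~present; }  // make not present
         void nullify()          { status |= null; }      // set to null

     protected:
         // only the object itself and derived classes may do these
         void setpresent()       { status |= present; }
         void clearnull()        { status &= ~null; }

     private:
         unsigned char status;
     };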
   Finally there is a bit which shows whether an object is atomic or a
repeated object.  This bit logically belongs to the class of the
object, which is presented below (*note Repetition and
optionality::.).

  5. repeat


File: ProtoGen.info, Node: Class and type, Next: Repetition and optionality, Prev: States, Up: General concept

Classes and types of objects
----------------------------

   An HL7 object may be of one of the following classes, which are not
to be confused with C++ classes:

`dset'
     the set of delimiters

`message'
     an HL7 message

`group'
     a group in an HL7 message

`segment'
     a segment

`datyp'
     a data type, either atomic or composite

   Except for the set of delimiters, any class may be repetitive,
which is reflected in the repeat bit of the status byte.

   An HL7 object of a certain class may be of some type.  For example,
if the object is a segment, then the type would be MSH or QRR etc.
The appropriate values are taken from the tables which define the
message types, segment ids, and data types.

   The information about class and type is not as vital as the status
information.  In fact, there are object classes which do not have a
type (group, dset) or which cannot repeat (dset).  However, there are
cases when e.g. a segment has to be read without knowing in advance
which segment it will be.  Therefore we need a method to identify the
type of a segment anyway, and thus it seems reasonable to let the
object identify itself.  Classes and types can be set only by the
object's setclass() and settype() members, which are not public.  The
public can, however, see the class and type of an object with
theclass() and thetype().


File: ProtoGen.info, Node: Repetition and optionality, Next: Register, Prev: Class and type, Up: General concept

Repetition and optionality
--------------------------

Repetition
..........

   Repetition is modeled by the class `repeated', which is defined
using the template feature and thus can easily be reused for any class
of HL7 objects.  The `repeated' class is derived from the class that
is to repeat, enhancing the latter merely by a pointer to the next
object, which is called `cdr'; thus any repeated object is organized
as a linked list.  This is a more homogeneous approach than using
arrays, since there is often no maximal repetition specified.  There
are member functions which allow the input and output of repeated
objects as well as easy access to any member of the list, which is
syntactically the same as referencing a member of an array.  This
could be achieved by overloading the array reference operator `[]'.
The assignment operator is overloaded as well, in order to make a copy
of the list rather than just assigning the reference.

Optionality
...........

   Since the HL7 specifications demand tolerance with respect to
missing or unexpected objects, we do not care much about whether a
segment is marked as required in the specifications.  It is rather up
to the application to reject incomplete messages, while unexpected
objects tend to be just ignored.


File: ProtoGen.info, Node: Register, Next: Methods, Prev: Repetition and optionality, Up: General concept

Register
--------

   Since most C++ compilers impose a minimal size on a class, there
may be some bytes left over which can be used by derived classes or
are wasted otherwise.  Sometimes a derived class needs just one bit of
information which, if stored individually in the space of the derived
object, would require another word of minimal alignment width.  In
order to save memory space, the derived classes are invited to use the
`register' for the storage of their flags.
   sreg(), creg() and greg() give arbitrary access to the flags.  To
prevent conflicts along a derivation chain, the usage of the register
must be documented in a register allocation table.


File: ProtoGen.info, Node: Methods, Prev: Register, Up: General concept

Methods
-------

   Any object has at least the following methods:

  1. Two constructors, one with an empty parameter list and one with
     all components that are accessible by the public.  The former
     sets the object to non-existent, the latter sets the object to
     existent and initializes it with the given parameters.  The types
     of the parameters are the same as the types of the components.
     Optionally, a constructor can be provided which takes C base
     types as parameters and performs the conversion.

  2. A destructor, which tends to be an empty function.  However,
     dynamically allocated memory is freed from here.

  3. Selector(s) or extractor(s), one for each component of public
     relevance.  Extractors are canonically named `get' followed by
     the name of the component that the extractor accesses.  The
     component is not returned as the value of the function but is
     assigned to a reference parameter.  Selectors return an integer
     value reflecting the success of the extraction.  A return value
     of zero means that the extraction was performed successfully.
     Values less than zero mean that the object itself cannot perform
     the extraction (often because it is not present or is null).  A
     value greater than zero tells that the returned object is
     unhealthy in some way.

  4. Modifier(s), one for each component of public relevance.
     Modifiers always set an object to be present.  nullify() is a
     special modifier.

  5. An input method, which returns 1 if reading was successful, 0
     otherwise.

  6. An output method.

   Input and output methods can handle different encoding rules
depending on some properties of the stream.  See *Note I/O methods::
for more.


File: ProtoGen.info, Node: I/O methods, Next: The code generator, Prev: General concept, Up: Generating C++ code

I/O methods
===========

   The encoded segments will finally appear on a stream, just as the
decoding will process data that arrives on a stream.  The stream may
be bound to an RS-232 line, a TCP/IP connection, a batch file, a pipe
or whatever medium is required.  There is an option to choose whether
the output methods are to produce HL7 encoding rule compliant data or
human readable data, which is used to debug the protocol.  This human
readable data will be output in a LISP-like style, since LISP provides
a simple notation for complex objects.  The style can, however, be
modified at compile time.

   The mechanisms that C++ provides with the `iostream' library are
very powerful and elegant.  However, we have to extend the iostream
object by some variables which allow us to set several states for the
streams.  There are currently the following states that a stream can
assume:

`hl7er'
     The stream transports HL7 encoding rules.

`debug'
     The stream transports human readable data.

`level'
     A small integer which tells which delimiters are currently to be
     used.

   `hl7er' and `debug' are boolean values which are of course mutually
exclusive.  They are, however, not represented in a single bit because
more coding modes may be integrated into the system.  The `level' is
normally set to 0, but there are cases when the level of the stream
must be increased.  Consider a composite data type CX which refers in
one of its components to another composite data type CY: the
components of CY thus become the subcomponents of CX.  However, the
I/O methods of CY need to know whether CY is regarded as a composite
field or as a composite component.  If the latter is true, the level
is increased by one and CY uses the subcomponent delimiter to
terminate its components.
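   A minimal sketch of how the level could select the terminating
delimiter (the names `Delimiters' and item_delimiter() are made up for
this illustration):

     // Sketch only: how a composite's output method might pick the
     // delimiter that terminates its items, driven by the stream level.
     struct Delimiters {
         char field;          // e.g. '|'
         char component;      // e.g. '^'
         char subcomponent;   // e.g. '&'
     };

     char item_delimiter(const Delimiters& d, int level)
     {
         switch (level) {
         case 0:  return d.component;     // composite used as a field
         case 1:  return d.subcomponent;  // composite used as a component
         default: return '\0';            // no sub-sub-component delimiter
         }
     }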
   However, it becomes clear that CY may not in turn have a composite
component, since there is no sub-sub-component delimiter.  In fact, in
v2.1 the subcomponent feature is never used.  Rather, such composites
are flattened, terminating each component with a `^' (for an example
see the definition of the CN data type in chapter 2).


File: ProtoGen.info, Node: The code generator, Prev: I/O methods, Up: Generating C++ code

The code generator
==================

   The code generator is a program, currently written in Prolog, which
produces C++ code from the HL7 database.  This compiler is still under
development and remains tentative.  At the moment it generates code
for composite data types, segments and tables.  The last step,
generating C++ classes for messages, is not finished yet.

   The Prolog program was first created in a monolithic approach and
has then been split up into modules, which are far easier to maintain
than the monolith.  There are tables scattered throughout the modules
which enhance the database we have created with detailed information
about many different things, including composite types, required
methods and their implementation etc.  In order to keep the complexity
of the compiler as small as possible, the macro features of the C
preprocessor are used extensively.  The distinction made between the
abstract handling of objects (by macros) and the concrete
implementation (by the macro definitions) helps to ease porting of the
code to different platforms.

* Menu:

* Tables::


File: ProtoGen.info, Node: Tables, Prev: The code generator, Up: The code generator

Tables
------

   Tables are not regarded as HL7 objects as described above (*note
General concept::.), since tables are not exchanged during
transactions.  Rather, tables provide the means to interpret ID data
fields correctly.  They are merely classes with an enum type, which is
a mapping from symbolic names to integers, and a static array, which
provides the mapping from the integers to character strings.  Finally
some useful member functions are provided which look up the item
number of an ID in the table of character strings, or translate such a
number back to an ID.  Even though these table objects are simple, it
seems reasonable to enhance them to interface with a database
management system for bigger tables such as classifications and other
coding systems.

   Some identifiers used by HL7 are listed in an HL7 table, notably
the message type identifiers.  But there are others, such as segment
and data type identifiers, which are not listed in such a table.  It
would be of some use for implementations to have these tables.  Rather
than keeping distinct tables for the implementation of the protocol
and for the application using the protocol, there should be only one
set of tables which is uniformly used by both.  Thus we will generate
entries in the `Table' and `Table Value' relations which list segment
and type identifiers.  To avoid conflicts with existing table numbers
we will count down from the highest possible number (i.e. 9999 for
tables and 999999 for values).
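   A minimal sketch of what such a generated table class could look
like; the class name, the table chosen and its values are assumptions
for illustration, not actual generated code:

     // Sketch of a generated table class; enum names would be derived
     // from the value descriptions by the naming function described
     // under Names.
     #include <string.h>

     class ProcessingIdTbl      // e.g. HL7 table 0103, "Processing ID"
     {
     public:
         enum id { Debugging, Production, Training, nvalues };

         // translate an item number back to its ID string
         static const char* string(id i) { return values[i]; }

         // look up the item number of an ID string, -1 if not in table
         static int lookup(const char* s)
         {
             for (int i = 0; i < nvalues; i++)
                 if (strcmp(values[i], s) == 0)
                     return i;
             return -1;
         }

     private:
         static const char* const values[nvalues];
     };

     const char* const ProcessingIdTbl::values[ProcessingIdTbl::nvalues] =
         { "D", "P", "T" };

   An ID field could then check a received value with lookup() and
store the symbolic id rather than the raw string.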
File: ProtoGen.info, Node: Integration into the system, Next: Bibliography, Prev: Generating C++ code, Up: Top

Integration into the system
***************************

   The HL7 implementation as we have built it so far consists merely
of a library which handles encoding and decoding.  However, means have
to be provided by which these functions are controlled.  In general,
there are many possibilities for doing this.  Here we only give a
short list of what will be included in the system:

   * a batch file interpreter

   * a module that handles incoming unsolicited messages, i.e. a
     program that is invoked by the internet daemon (inetd(8)), which
     listens at a certain TCP/IP port for incoming packets

   * a module that does the same job on RS-232 lines by listening for
     incoming messages

   * a module that helps to connect to an HL7 server via TCP/IP or
     RS-232

   All these modules will use some kind of lower layer protocol as
described in the HL7 v2.1 document.  While on TCP/IP links the minimal
lower layer protocol would be sufficient, the X3.28 based data link
protocol will have to be implemented to support RS-232 connections if
SLIP or PPP is not used.


File: ProtoGen.info, Node: Bibliography, Prev: Integration into the system, Up: Top

Bibliography
************

KUPERMAN (1991)
     Gilead J. Kuperman, Reed M. Gardner, T. Allan Pryor: `HELP: A
     dynamic hospital information system', Springer-Verlag, 1991.

ROSE (1990)
     Marshall T. Rose: `The open book: A practical perspective on
     OSI', Prentice-Hall, 1990.

LEVENE (1992)
     Mark Levene: `The nested universal relation database model';
     Vol. 595 of G. Goos and J. Hartmanis (Eds.): `Lecture notes in
     computer science', Springer-Verlag, 1992.