org.regenstrief.xhl7
Class HL7XMLReader

java.lang.Object
  extended by org.regenstrief.xhl7.HL7XMLReader
All Implemented Interfaces:
HL7XMLLiterate, org.xml.sax.XMLReader

public class HL7XMLReader
extends java.lang.Object
implements org.xml.sax.XMLReader, HL7XMLLiterate

This class parses HL7 messages and generates a very simple XML format. This is NOT the same format as the HL7 v2xml specification, for a number of reasons. The most important is that the v2xml format cannot be generated from HL7 instances alone without much help from the specification: one needs to know, for example, which data type a certain field has, and groups must be identified as elements of their own.

    <hl7>
      <segment tag="MSH">
        <field>SIMPLE CONTENT</field>
        <field>FIRST COMPONENT<component>SECOND COMPONENT</component>
          <component>THIRD COMPONENT</component>
        </field>
        <field>FIRST REPETITION<repeat>SECOND REPETITION</repeat>
          <repeat>THIRD REPETITION</repeat>
        </field>
        <field>FIRST COMPONENT FIRST ITEM<component>SECOND COMPONENT</component>
           <repeat>...</repeat>
        <field>...<component>...<subcomponent>...</subcomponent>
              <subcomponent></subcomponent>
            </component>
         </repeat>
       </field>
     </segment>
   </hl7>
   

This is what I call "lazy structure": structural tags are used only at the point where they are really needed. It is easy to use in XSLT and XPath. The first node in an element is the text of the first item. The next node, if any, is a structural tag that tells you at which structural level that first text node sits. Since HL7 has no mixed content models, there is never any ambiguity.
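To illustrate how the lazy structure reads in XPath, here is a minimal sketch using only the JDK. The sample document is hand-written in the format shown above and is not actual output of this class; the class name LazyStructureXPath and the PID content are made up for illustration, and the sketch assumes the elements carry no namespace.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class LazyStructureXPath {
    // Hand-written sample in the lazy format (an assumption, not real reader output).
    static final String SAMPLE =
        "<hl7><segment tag=\"PID\">"
        + "<field>1</field>"
        + "<field>DOE<component>JOHN</component><component>Q</component></field>"
        + "</segment></hl7>";

    // Evaluates an XPath expression against SAMPLE and returns its string value.
    static String eval(String expr) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expr, new InputSource(new StringReader(SAMPLE)));
    }

    public static void main(String[] args) throws Exception {
        // The first component is the leading text node of the field.
        System.out.println(eval("/hl7/segment[@tag='PID']/field[2]/text()[1]"));
        // Later components are wrapped in structural tags.
        System.out.println(eval("/hl7/segment[@tag='PID']/field[2]/component[1]"));
        System.out.println(eval("/hl7/segment[@tag='PID']/field[2]/component[2]"));
    }
}
```

Note that the n-th component of a field is component[n-1], because the first one is plain text directly inside the field element.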

This follows the lazy spirit of HL7 v2. It does so not because the author believes that that is a good way of thinking and handling information, but because the real world is just that messy and after having done all a person can do to produce a structure-anal HL7 parser (ProtoGen/HL7) the author has given up any hope that HL7 v2.x use will ever get there.

This class behaves like an XML SAX parser, i.e., upon reading an HL7 message it generates SAX events. It is extremely simple and extremely easy to use with standard XML tools in Java. One can simply run the HL7 message through an XSLT transform. And this is really the main purpose of this class: to open up HL7 v2.x messages, however ugly, to the world of powerful XSLT transforms. This can be used to drive message processors or just message transformers that end up emitting the result of the transformation in HL7 v2 syntax.

Note also that there is no guarantee the result is actually an HL7 message. It could be a batch or a continuation of a preceding message. That's why the top-level element isn't called "message" but simply "hl7".

You can invoke this in various ways according to the TRAX specification, as this class implements the SAX XMLReader interface. I recommend using Saxon v7 or higher as follows:

$ saxon7 -x HL7XMLReader some.hl7 transform.xsl
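The same hookup can be done programmatically through the standard TrAX API by wrapping the reader in a SAXSource. The following is only a sketch: the class name TraxPipeline, the helper transform and the sample stylesheet are invented for illustration, and main uses the JDK's own XML parser as a stand-in so that the code runs without the xhl7 library.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

final class TraxPipeline {
    // Illustrative stylesheet: prints the tag of every segment, one per line.
    static final String TAGS_XSLT =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/\">"
        + "<xsl:for-each select=\"hl7/segment\">"
        + "<xsl:value-of select=\"@tag\"/><xsl:text>&#10;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template></xsl:stylesheet>";

    // Feeds the SAX events produced by any XMLReader into an XSLT transform.
    static String transform(XMLReader reader, InputSource input, String xslt)
            throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new SAXSource(reader, input), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in reader and input (assumptions) so the sketch runs with the JDK alone.
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        String xml = "<hl7><segment tag=\"MSH\"/><segment tag=\"PID\"/></hl7>";
        System.out.print(transform(reader, new InputSource(new StringReader(xml)), TAGS_XSLT));
    }
}
```

With the xhl7 library on the classpath, one would presumably pass new org.regenstrief.xhl7.HL7XMLReader() as the reader and an InputSource for the HL7 message itself; which InputSource variants the class accepts is not stated here.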


Nested Class Summary
static class HL7XMLReader.Tokenizer
          A buffering string tokenizer with ample look-ahead.
 
Field Summary
 
Fields inherited from interface org.regenstrief.xhl7.HL7XMLLiterate
ATT_DEL_COMPONENT, ATT_DEL_ESCAPE, ATT_DEL_FIELD, ATT_DEL_REPEAT, ATT_DEL_SUBCOMPONENT, CDATA, DEFAULT_DELIMITERS, DELIMITER_ESCAPES, N_DEL_COMPONENT, N_DEL_ESCAPE, N_DEL_FIELD, N_DEL_REPEAT, N_DEL_SUBCOMPONENT, NAMESPACE_URI, NUMBER_OF_DELIMITERS, TAG_COMPONENT, TAG_ESCAPE, TAG_FIELD, TAG_REPEAT, TAG_ROOT, TAG_SUBCOMPONENT
 
Constructor Summary
HL7XMLReader()
           
 
Method Summary
 org.xml.sax.ContentHandler getContentHandler()
          Returns the content handler currently set.
 org.xml.sax.DTDHandler getDTDHandler()
          A no-op.
 org.xml.sax.EntityResolver getEntityResolver()
          A no-op.
 org.xml.sax.ErrorHandler getErrorHandler()
          A no-op at this time.
 boolean getFeature(java.lang.String name)
          Echoes back features that had been set earlier (or their defaults).
 java.lang.Object getProperty(java.lang.String name)
          A no-op at this time.
static void main(java.lang.String[] args)
          Test utility, turn an HL7 message into XML.
 void parse(org.xml.sax.InputSource input)
          Parse an HL7 message from the given InputSource.
 void parse(java.lang.String url)
          Parse an HL7 message from a URL.
 void setContentHandler(org.xml.sax.ContentHandler contentHandler)
          Sets content handler that will receive the next event that we emit (and all events after that until another content handler is set).
 void setDTDHandler(org.xml.sax.DTDHandler x)
          A no-op.
 void setEntityResolver(org.xml.sax.EntityResolver x)
          A no-op.
 void setErrorHandler(org.xml.sax.ErrorHandler ErrorHandler)
          A no-op at this time.
 void setFeature(java.lang.String name, boolean value)
          Sets a feature.
 void setProperty(java.lang.String name, java.lang.Object value)
          A no-op at this time.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

HL7XMLReader

public HL7XMLReader()
Method Detail

getContentHandler

public org.xml.sax.ContentHandler getContentHandler()
Returns the content handler currently set.

Specified by:
getContentHandler in interface org.xml.sax.XMLReader
See Also:
org.xml.sax.XMLReader

setContentHandler

public void setContentHandler(org.xml.sax.ContentHandler contentHandler)
Sets content handler that will receive the next event that we emit (and all events after that until another content handler is set).

Specified by:
setContentHandler in interface org.xml.sax.XMLReader
See Also:
org.xml.sax.XMLReader

parse

public void parse(org.xml.sax.InputSource input)
           throws java.io.IOException,
                  org.xml.sax.SAXException
Parse an HL7 message from the given InputSource.

Specified by:
parse in interface org.xml.sax.XMLReader
Throws:
java.io.IOException
org.xml.sax.SAXException
See Also:
org.xml.sax.XMLReader
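
As a sketch of how setContentHandler and parse work together, the handler below collects the tag attribute of every segment element it sees. The class name SegmentTagCollector and the helper collect are hypothetical; the sketch assumes that the element name "segment" and the attribute "tag" arrive as shown in the class description, and main again uses the JDK parser as a stand-in for this class.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class SegmentTagCollector extends DefaultHandler {
    private final List<String> tags = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Fall back to the qualified name when the reader reports no local name.
        String name = (localName == null || localName.isEmpty()) ? qName : localName;
        if ("segment".equals(name)) {
            tags.add(atts.getValue("tag"));
        }
    }

    // Registers a fresh collector on the reader, parses, and returns the tags seen.
    static List<String> collect(XMLReader reader, InputSource input)
            throws IOException, SAXException {
        SegmentTagCollector handler = new SegmentTagCollector();
        reader.setContentHandler(handler);
        reader.parse(input);
        return handler.tags;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in reader and input (assumptions) so the sketch runs with the JDK alone.
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        String xml = "<hl7><segment tag=\"MSH\"/><segment tag=\"PID\"/></hl7>";
        System.out.println(collect(reader, new InputSource(new StringReader(xml))));
    }
}
```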

parse

public void parse(java.lang.String url)
           throws java.io.IOException,
                  org.xml.sax.SAXException
Parse an HL7 message from a URL.

Specified by:
parse in interface org.xml.sax.XMLReader
Throws:
java.io.IOException
org.xml.sax.SAXException
See Also:
org.xml.sax.XMLReader

getFeature

public boolean getFeature(java.lang.String name)
                   throws org.xml.sax.SAXNotRecognizedException,
                          org.xml.sax.SAXNotSupportedException
Echoes back features that had been set earlier (or their defaults).

Specified by:
getFeature in interface org.xml.sax.XMLReader
Throws:
org.xml.sax.SAXNotRecognizedException
org.xml.sax.SAXNotSupportedException
See Also:
setFeature(java.lang.String, boolean) for what is supported, org.xml.sax.XMLReader

setFeature

public void setFeature(java.lang.String name,
                       boolean value)
                throws org.xml.sax.SAXNotRecognizedException,
                       org.xml.sax.SAXNotSupportedException
Sets a feature.

Certain feature requests are tolerated without error, but are silently ignored and not echoed back by getFeature.

Specified by:
setFeature in interface org.xml.sax.XMLReader
Throws:
org.xml.sax.SAXNotRecognizedException
org.xml.sax.SAXNotSupportedException
See Also:
org.xml.sax.XMLReader
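
Because a feature request may be accepted yet silently ignored, a caller that cares can set the feature and then read it back. The helper below is a sketch against the generic XMLReader interface; the class name FeatureProbe, the method probe and its three result strings are invented for illustration, and main exercises it with the JDK parser rather than with this class.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

final class FeatureProbe {
    // Returns "REJECTED" if setFeature throws, "IGNORED" if the value cannot be
    // read back as requested, and "HONORED" if getFeature echoes the value.
    static String probe(XMLReader reader, String name, boolean value) {
        try {
            reader.setFeature(name, value);
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            return "REJECTED";
        }
        try {
            return reader.getFeature(name) == value ? "HONORED" : "IGNORED";
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            // Accepted on set but not echoed back on get.
            return "IGNORED";
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in reader (an assumption) so the sketch runs with the JDK alone.
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        System.out.println(probe(reader, "http://xml.org/sax/features/namespaces", true));
        System.out.println(probe(reader, "http://example.invalid/no-such-feature", true));
    }
}
```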

getProperty

public java.lang.Object getProperty(java.lang.String name)
                             throws org.xml.sax.SAXNotRecognizedException,
                                    org.xml.sax.SAXNotSupportedException
A no-op at this time.

Specified by:
getProperty in interface org.xml.sax.XMLReader
Throws:
org.xml.sax.SAXNotRecognizedException
org.xml.sax.SAXNotSupportedException
See Also:
org.xml.sax.XMLReader

setProperty

public void setProperty(java.lang.String name,
                        java.lang.Object value)
                 throws org.xml.sax.SAXNotRecognizedException,
                        org.xml.sax.SAXNotSupportedException
A no-op at this time.

Specified by:
setProperty in interface org.xml.sax.XMLReader
Throws:
org.xml.sax.SAXNotRecognizedException
org.xml.sax.SAXNotSupportedException
See Also:
org.xml.sax.XMLReader

getErrorHandler

public org.xml.sax.ErrorHandler getErrorHandler()
A no-op at this time.

Specified by:
getErrorHandler in interface org.xml.sax.XMLReader
See Also:
org.xml.sax.XMLReader

setErrorHandler

public void setErrorHandler(org.xml.sax.ErrorHandler ErrorHandler)
A no-op at this time.

Specified by:
setErrorHandler in interface org.xml.sax.XMLReader
See Also:
org.xml.sax.XMLReader

getDTDHandler

public org.xml.sax.DTDHandler getDTDHandler()
A no-op. This is an irrelevant issue for HL7 parsing.

Specified by:
getDTDHandler in interface org.xml.sax.XMLReader
See Also:
org.xml.sax.XMLReader

getEntityResolver

public org.xml.sax.EntityResolver getEntityResolver()
A no-op. This is an irrelevant issue for HL7 parsing.

Specified by:
getEntityResolver in interface org.xml.sax.XMLReader
See Also:
org.xml.sax.XMLReader

setDTDHandler

public void setDTDHandler(org.xml.sax.DTDHandler x)
A no-op. This is an irrelevant issue for HL7 parsing.

Specified by:
setDTDHandler in interface org.xml.sax.XMLReader
See Also:
org.xml.sax.XMLReader

setEntityResolver

public void setEntityResolver(org.xml.sax.EntityResolver x)
A no-op. This is an irrelevant issue for HL7 parsing.

Specified by:
setEntityResolver in interface org.xml.sax.XMLReader
See Also:
org.xml.sax.XMLReader

main

public static void main(java.lang.String[] args)
                 throws java.lang.Exception
Test utility that turns an HL7 message into XML. This is deliberately kept absolutely simple, stupid, and powerless: it only dumps the XML. If you want to actually transform the XML, hook this XMLReader into your favorite XSLT engine. E.g., with Saxon you give the option: -x HL7XMLReader.

Throws:
java.lang.Exception