Medical terminology version control discussion paper: The chocolate teapot (Version 2.5)

Summary

Modern medical terminologies i.e. those attempting to fulfill the Cimino Desiderata unlike classifications

It is widely assumed that terminologies are thereby immune from semantic discontinuity across releases. They are not. In the absence of NOS and NEC, static 'known unknowns' become mobile 'unknown unknowns'. Terminologies suffer the same semantic drift that classifications suffer when items are introduced or removed, or when the hierarchy is reorganised. Successive releases of a terminology may incrementally improve content. However, changes can mean that interpretation of recorded data requires reference to the terminology as it was when the concept was chosen, NOT as it is 'now'.

Unlike terminologies, classifications are usually composed of 'disjoint' entities.

The presence of non-disjoint concepts and the lack of NOS and NEC in terminologies confounds machine data interpretation even within a single version of the terminology.

The structure and attendant coding disciplines of classifications are advantageous where data reuse can take priority over expressivity. Terminologies and classifications can be linked with 'options' to be reconciled by an intelligent process of abstraction. Terminologies and classifications have distinct applications and neither can replace the other.

Introduction

In this paper I present twin theses that "terminology versions matter" and "classifications are different from terminology and remain necessary". For my examples I use non-medical concepts. Not all readers will be confident they know what (say) asthma really 'is'. Indeed a single physician can start a fight in an otherwise empty room over "what is asthma?". In contrast most people will think they know what a teapot is. Until they read this article.

I introduce here two fantastic artifacts; one a classification, the other a terminology. The terminology content changes over 5 releases. Please consider as the terminology evolves how this might affect the following:

This paper closes with some closer to life examples.

The chocolate teapot

Here is a segment of a naïve but functional crockery classification which exists solely to document the appearance of table settings as described in Victorian literature.

Classification of Tableware Version 9 (1875 revision)

Crockery
---Teapot
------Brown teapot
------White teapot
------Blue teapot
------Teapot, color not elsewhere classified
------Teapot, color not otherwise specified

Users are obliged to 'code to the leaf'. Only brown, white and blue crockery were fashionable in 1875. The uncouth might deploy other colors ('NEC') or might not even care ('NOS'). However it is both unthinkable and unstated that your teapot could be made of anything other than porcelain.


Elsewhere there is the frequently updated Systematized Nomenclature of Kitchens (SNoKitch) intended to support all catering system and food retail applications. A fragment of this terminology is followed through a number of iterations:

SNoKitch Release n

Crockery
---Teapot

Teapot has no children and could be equivalent to any of the five classification leaves. Concept requests ensue.

SNoKitch Release n+1

Crockery
---Teapot
------Brown teapot
------White teapot

Anyone wanting to specify a white or brown teapot using SNoKitch can now do so. Coding to leaf is not mandatory and NOS and NEC are absent. The 'Teapot' concept in the terminology is the equivalent of both the classification's 'Teapot, color (NEC)' and its 'Teapot, color (NOS)'. In use, the meaning of the 'Teapot' concept has been skewed. It is now more likely to be used to encode a teapot which is neither brown nor white. All of us can still understand the broad meaning of the parent 'Teapot' concept (and understand its descendants as valid subclasses).

SNoKitch Release n+2

Crockery
---Teapot
------Brown teapot
------White teapot
------Blue teapot
------China teapot

In release n+2 blue teapot (forgotten in release n+1) has been added as a child. This alters the meaning of the concept 'Teapot' in its role as 'Teapot, color NEC' manqué.
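The shift can be made concrete by comparing what a coder could have chosen in each release. The following is a minimal sketch only: it assumes each release is reduced to a list of the children of 'Teapot', and the names `TEAPOT_CHILDREN` and `newly_added` are mine, not part of any SNoKitch tooling.

```python
# Hypothetical snapshots: children of 'Teapot' in three SNoKitch releases.
TEAPOT_CHILDREN = {
    "n": [],
    "n+1": ["Brown teapot", "White teapot"],
    "n+2": ["Brown teapot", "White teapot", "Blue teapot", "China teapot"],
}

def newly_added(old_release, new_release):
    """Children present in new_release but absent from old_release."""
    old = set(TEAPOT_CHILDREN[old_release])
    return sorted(set(TEAPOT_CHILDREN[new_release]) - old)

print(newly_added("n", "n+1"))    # ['Brown teapot', 'White teapot']
print(newly_added("n+1", "n+2"))  # ['Blue teapot', 'China teapot']
```

Every concept in these lists is something the bare 'Teapot' code silently stopped covering, in practice if not in definition, at the later release.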

Meanwhile the newly added 'China teapot' child is the first not to be a discrete sibling i.e. it could be any color. So 'teapot' now has non-disjoint subclasses.

'China teapot' has maybe acquired part of the meaning of 'Teapot, color (NEC)' from 'Teapot' i.e. China teapots which are not colored brown, white or blue.

When people interested in whether a teapot is made of china receive the 'white teapot' code they are out of luck. All they can infer is that it represents a teapot.

As both the material and color matter to us we request new concepts 'white china teapot' and 'blue china teapot' for the next release. (We don't have brown ones. Their china counterparts are neither requested nor created.) However we can never rely on systems elsewhere using them because they don't have to code to leaf. Another system's users may say 'I only know / care / have resources to determine whether teapots are china. We don't capture color here'. Alternatively they simply may not update their picklists.

However a common understanding of the meaning of the parent 'Teapot' concept as superclass of its children persists. Concepts added in the next release shall overturn this understanding.

SNoKitch Release n+3

Crockery
---Teapot
------Brown teapot
------White teapot
---------White china teapot
------Blue teapot
---------Blue china teapot
------China teapot
---------White china teapot
---------Blue china teapot
------Pink teapot
------Aluminium teapot
------Chocolate teapot
------Ornamental teapot
------Industrial teapot

Expressivity now greatly surpasses that of the Classification of Tableware Version 9. However ontological continuity with previous terminology versions has ruptured. Consider what we can understand from teapots previously coded in releases n, n+1 and n+2.

Can aluminium teapots be any colour other than metallic grey? Chocolate teapots can be white or brown but probably not blue i.e. some siblings are mutually exclusive and others are not. Industrial teapot tells me where it is used but nothing else except that I assume it will never be ornamental or chocolate.
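Some of this overlap is visible to a machine and some is not. As a rough sketch, assuming the release n+3 fragment is held as a parent-to-children map (the structure and the function names are illustrative only), siblings that share a descendant can be detected mechanically:

```python
from itertools import combinations

# Hypothetical parent -> children map for the release n+3 fragment.
CHILDREN = {
    "Teapot": ["Brown teapot", "White teapot", "Blue teapot", "China teapot",
               "Pink teapot", "Aluminium teapot", "Chocolate teapot",
               "Ornamental teapot", "Industrial teapot"],
    "White teapot": ["White china teapot"],
    "Blue teapot": ["Blue china teapot"],
    "China teapot": ["White china teapot", "Blue china teapot"],
}

def descendants(concept):
    """All concepts below the given concept, at any depth."""
    found = set()
    for child in CHILDREN.get(concept, []):
        found.add(child)
        found |= descendants(child)
    return found

def overlapping_siblings(parent):
    """Pairs of children that provably overlap because they share a descendant."""
    pairs = []
    for a, b in combinations(sorted(CHILDREN[parent]), 2):
        if descendants(a) & descendants(b):
            pairs.append((a, b))
    return pairs

print(overlapping_siblings("Teapot"))
# [('Blue teapot', 'China teapot'), ('China teapot', 'White teapot')]
```

The converse does not hold. 'Chocolate teapot' and 'White teapot' share no descendant here, yet a white chocolate teapot is perfectly conceivable, so the absence of a shared child says nothing about disjointness.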

NOS and NEC, with respect to color, material and utility, are now in Brownian motion between releases.

Worse, the arrival of certain children has altered the meaning we can safely infer from the unadorned 'Teapot' parent concept.

The Kansas Tea Company operate decision support software designed against release n+2:

If Concept == (teapot or child of teapot)
    Safe to add tea leaves plus boiling water
End If

They had not anticipated confectionery in this hierarchy. As for 'ornamental teapots', who knows if they can be used to make tea?
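The misfire is easy to reproduce. The sketch below is my own rendering of the rule, not the Kansas Tea Company's software: it assumes each release is a child-to-parents map, and `is_a` and `safe_to_brew` are hypothetical names.

```python
# Hypothetical child -> parents maps for two SNoKitch releases (fragments only).
PARENTS_N2 = {
    "Teapot": ["Crockery"],
    "Brown teapot": ["Teapot"],
    "White teapot": ["Teapot"],
    "Blue teapot": ["Teapot"],
    "China teapot": ["Teapot"],
}
PARENTS_N3 = {
    **PARENTS_N2,
    "White china teapot": ["White teapot", "China teapot"],
    "Blue china teapot": ["Blue teapot", "China teapot"],
    "Pink teapot": ["Teapot"],
    "Aluminium teapot": ["Teapot"],
    "Chocolate teapot": ["Teapot"],
    "Ornamental teapot": ["Teapot"],
    "Industrial teapot": ["Teapot"],
}

def is_a(concept, ancestor, parents):
    """True if concept equals ancestor or descends from it in this release."""
    if concept == ancestor:
        return True
    return any(is_a(p, ancestor, parents) for p in parents.get(concept, []))

def safe_to_brew(concept, parents):
    """The rule as designed against release n+2: any teapot will do."""
    return is_a(concept, "Teapot", parents)

print(safe_to_brew("White teapot", PARENTS_N3))      # True
print(safe_to_brew("Chocolate teapot", PARENTS_N3))  # True, the rule misfires
print(safe_to_brew("Chocolate teapot", PARENTS_N2))  # False, unknown in n+2
```

The rule itself never changed. Only the hierarchy it is evaluated against did.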

Meanwhile Tea Shop Transiente had made requests to have various 'Tea making related findings' in Release n+3. These presently reside in a non-contiguous branch of the hierarchy due to (arguably) authoring error.

Tea making related findings
---Teapot related findings
------Large teapot
------Full teapot
------Empty teapot
------Teapot with warm water
------Teapot with cold water

This gives Tea Shop Transiente the pick list they want (at least until the concepts under "Teapot related findings" become dispersed in subsequent releases). Before SNoKitch Release n+3 we were sure of capturing all teapot related records with the codes for 'Teapot' and descendants.

Now the Kansas Tea Company must locate multiple nodes within their decision support rule and be able to exclude subtypes e.g. it would not be wise to add tea leaves and boiling water to either "Full teapot" or "Chocolate teapot".
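A repaired rule might look something like the following sketch. It assumes a child-to-parents map covering both branches, and the include and exclude lists are my own guesses at what the Kansas Tea Company would choose; nothing in SNoKitch supplies them.

```python
# Hypothetical child -> parents map spanning both branches of release n+3.
PARENTS = {
    "Teapot": ["Crockery"],
    "White teapot": ["Teapot"],
    "Chocolate teapot": ["Teapot"],
    "Ornamental teapot": ["Teapot"],
    "Teapot related findings": ["Tea making related findings"],
    "Full teapot": ["Teapot related findings"],
    "Empty teapot": ["Teapot related findings"],
}

# Assumed rule configuration: roots to search and subtypes to reject.
INCLUDE = {"Teapot", "Teapot related findings"}
EXCLUDE = {"Chocolate teapot", "Full teapot"}

def ancestors_or_self(concept):
    """The concept together with everything above it in the hierarchy."""
    seen, stack = set(), [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen

def safe_to_brew(concept):
    """True if the concept sits under an included root and no excluded node."""
    lineage = ancestors_or_self(concept)
    return bool(lineage & INCLUDE) and not (lineage & EXCLUDE)

print(safe_to_brew("Empty teapot"))      # True
print(safe_to_brew("Full teapot"))       # False
print(safe_to_brew("Chocolate teapot"))  # False
```

Note that 'Ornamental teapot' still passes. The exclude list is only as good as its author's knowledge of the release in front of them, and it must be revisited at every subsequent release.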

People may dispute these 'is a' relationships. However their validity depends on your interest in teapots. There is no universal understanding of the concept. There was merely an assumption about the term in the earlier versions of the terminology which is now exposed as not applying in all contexts of use. If we had started with a clear definition of teapot (as a 'free text' scope note) this might have been avoided.

What about a logical definition? The questionable subclasses might anyway have been entered as primitives.

Try the exercise of suggesting attributes needed to logically define a 'true' teapot (here taken to mean a pot that you might make tea in). Pitfalls to avoid include erroneous subsumption of tea caddies, tea cups, tea strainers etc. Then use the same ontology to define a cheese knife distinct from fish or bread knives. (SNoKitch is the reference terminology for all catering concepts). Having solved these puzzles move on from a trivial set of simple tangible objects to disorders, disease related events, clinical findings and observations, operative procedures etc.

Meanwhile the SNoKitch terminology maintenance organisation is under pressure to add 400 further teapot concepts to satisfy the demanding Oriental market including "Rough surfaced, egg shaped teapot used in Guangdong Province rice farm tea ceremony, usually no more than half-filled with tea always served cold".

SNoKitch Release n+4

For this release the Elbonian Tea Service had requested 'heat resistant chocolate teapot'. The old concept "Chocolate teapot" is retired as ambiguous, pointing by {may be a} links to a new "Chocolate teapot" and to "Heat resistant chocolate teapot", and another error in waiting is born.

The issue is

Chocolate teapot
---Heat resistant chocolate teapot

"Having a single subclass of a class usually points to a problem in modeling." The simultaneous creation of the concept "Heat sensitive chocolate teapot" is ideally mandatory. However a general purpose ontology cannot fully enumerate all possible children at lower levels of hierarchies. Neither can attributes be so defined that 'primitives' need never be created and post-coordinated concepts be consistently interpreted.

SNoKitch Release n+5

Most people can find terms which express the teapot that is in their head most of the time (except discriminating Indian-subcontinent tea experts). This is achieved at the expense of predictable machine readability even within different modules of the same application.

The Comestibles Supply Consortium recognises the incoherent use of the SNoKitch terminology across their systems and mandates use of a teapot subset.

Teapot
------Brown teapot
------White teapot
------Blue teapot

All within the Consortium will now understand what was recorded [but only within records from their organisation] and satisfy their concern to produce billable returns using Classification of Tableware Version 9. They still lack the NOS and NEC distinction and engage professional coders to abstract and map records manually.


Real Life?

Untouchable concepts?

Leprosy
----Dapsone resistant leprosy
----Tuberculoid leprosy
----Leprosy neuropathy
----Etc

Note the non-mutually exclusive concepts: the patient could have one, two or all three disorders. There is also not (yet) a way to specify dapsone sensitive leprosy. Consider the impact of the subsequent arrival of the latter concept.

The concept that wasn't there

Appendectomy
---Non-emergency appendectomy
---Laparoscopic appendectomy
------Laparoscopic emergency appendectomy
---Emergency appendectomy
------Laparoscopic emergency appendectomy
---Etc

Nowhere is an 'open' (non-laparoscopic) appendectomy specified. People will have been happily coding them using the appendectomy concepts unqualified by approach as they obviously mean 'open'. Except they don't!

What will happen when someone adds 'Open appendectomy'?

The conceptual discontinuum

asthma
---acute asthma
---asthma attack
---intrinsic asthma
---extrinsic asthma
---etc

while non-contiguously...

asthma finding
---mild asthma
---moderate asthma
---severe asthma
---asthma currently active
---asthma currently dormant

Discussion

Classifications such as the ICD family include items like "Chronic airway obstruction, not elsewhere classified" or "Other specified Excision of adrenal gland". A problem with such constructs is they are not stable in meaning across versions of the classification i.e. what is classified elsewhere changes with addition or removal of other content.

Another artifact within classifications is the "unspecified foo" item, also known as "foo not otherwise specified", e.g. in this section of the UK's OPCS 4.2 classification:

B22 | Excision of adrenal gland
--B221 | Bilateral adrenalectomy and transposition of adrenal tissue
--B222 | Bilateral adrenalectomy nec
--B223 | Unilateral adrenalectomy
--B224 | Partial adrenalectomy
--B228 | Other specified Excision of adrenal gland
--B229 | Unspecified Excision of adrenal gland

"Unspecified Excision of adrenal gland" allows the coder to distinguish the condition where the record says no more than an "Excision of adrenal gland" occurred. This is distinct from the situation where the record did supply more information but the other classification items were inadequate to capture it i.e. the not elsewhere classified scenario.

The rules of use of classifications include the mandate to code to leaf i.e. B22 | Excision of adrenal gland cannot validly be used to code a record.
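The discipline a classification imposes on the coder can be sketched as follows. The codes are those of the B22 fragment above; the `detail` strings, the lookup table and both function names are assumptions of mine for illustration, not part of OPCS.

```python
# The B22 fragment: the category and its leaves.
CHILDREN = {"B22": ["B221", "B222", "B223", "B224", "B228", "B229"]}

# Hypothetical mapping from what the record says to a specific leaf.
SPECIFIC = {
    "bilateral adrenalectomy and transposition of adrenal tissue": "B221",
    "bilateral adrenalectomy": "B222",
    "unilateral adrenalectomy": "B223",
    "partial adrenalectomy": "B224",
}

def abstract_code(detail):
    """Pick a leaf for an excision of adrenal gland.

    detail is None when the record says nothing more (NOS, B229).
    Detail that matches no specific leaf goes to 'other specified' (B228).
    """
    if detail is None:
        return "B229"
    return SPECIFIC.get(detail.lower(), "B228")

def is_valid_code(code):
    """Code to leaf: accept only known codes that have no children."""
    known = set(CHILDREN) | {c for kids in CHILDREN.values() for c in kids}
    return code in known and not CHILDREN.get(code)

print(abstract_code(None))                        # B229
print(abstract_code("Unilateral adrenalectomy"))  # B223
print(abstract_code("subtotal adrenalectomy"))    # B228
print(is_valid_code("B22"))                       # False
```

Whatever the record says, exactly one leaf is returned, and the distinction between "nothing more was recorded" and "something was recorded that we cannot express" survives in the code itself.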

Note also that classifications are designed to be both exhaustive (however vague they may be) and that items are mutually exclusive (disjoint) in use.

In contrast to classifications such as ICD-9, modern medical terminologies (i.e. those attempting to fulfill the Cimino Desiderata) typically

The widespread assumption is that this gives them immunity from semantic discontinuity across releases. It does not. In the absence of NOS and NEC static 'known unknowns' become 'unknown unknowns' mobile between releases. The situation is further exacerbated by terminologies being typically neither exhaustive (at any one 'level' of siblings) nor disjoint. This means that NOS and NEC will not be unambiguously identified even within a record encoded in a single version of the terminology where the coding use cases differ i.e. the terminology can be viewed as the union of multiple parallel and incomplete classifications.

As terminologies evolve, applications involving data reuse (messaging, billing, clinical audit, active decision support etc) may need to recognise the version of the terminology a code is drawn from.
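In practice this means storing the release alongside the code. A minimal sketch, returning to the teapots and assuming a record carries both the concept and the release in force when it was chosen (the `Record` type and the lookup table are hypothetical):

```python
from dataclasses import dataclass

# Hypothetical: children of 'Teapot' that a coder could pick in each release.
TEAPOT_CHILDREN = {
    "n": set(),
    "n+1": {"Brown teapot", "White teapot"},
    "n+2": {"Brown teapot", "White teapot", "Blue teapot", "China teapot"},
}

@dataclass(frozen=True)
class Record:
    concept: str
    release: str  # release in force when the concept was chosen

def blue_was_available(record):
    """Could the coder have chosen 'Blue teapot' when this record was made?"""
    return "Blue teapot" in TEAPOT_CHILDREN[record.release]

old = Record("Teapot", "n+1")
new = Record("Teapot", "n+2")

print(blue_was_available(old))  # False: this 'Teapot' may well be blue
print(blue_was_available(new))  # True: the coder passed over 'Blue teapot'
```

The two records carry the same code, but only the release stamp tells a later reader how much to read into the coder's choice of the bare parent.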

Successive versions may incrementally improve content but the alterations create circumstances where interpretation of recorded data requires reference to the ontology as it was when the concept was chosen NOT as it is 'now'.

A concept within a terminology is not entire unto itself. Concept use and interpretation are altered by addition, retirement and movements of other concepts. As well as additions and retirements there may be thousands of hierarchy changes between successive 'autoclassified' terminology releases. Other non-automated concept migrations may include even top level domain reassignment e.g. from Procedure or Clinical findings to Observable etc.

Post-coordination is no panacea, not least because the meanings of attribute values and value sets are themselves subject to 'versionitis'. Additionally, if the 'ontology (concept) model' or attributes are altered in any way, cross version compatibility of affected concepts will be unreliable.


©1998-2015 MOOSe Technology.