IHTSDO-409 (artf6245) ICD-O Mapping
SNOMED CT
Content Improvement Project
Inception phase
Project ID: artf6245 https://projects.jira.snomed.org/browse/IHTSDO-409
Topic: SNOMED CT to ICD-O Mapping
Date: July 2016
Version 1.1
Amendment History
Review Timetable
Review date | Responsible owner | Comments |
201606 | Yongsheng Gao | Editorial changes only |
© International Health Terminology Standards Development Organisation 2012. All rights reserved.
SNOMED CT® was originally created by the College of American Pathologists.
This document forms part of the International Release of SNOMED CT® distributed by the International Health Terminology Standards Development Organisation (IHTSDO), and is subject to the IHTSDO's SNOMED CT® Affiliate Licence. Details of the SNOMED CT® Affiliate Licence may be found at http://www.ihtsdo.org/our-standards/licensing/.
No part of this document may be reproduced or transmitted in any form or by any means, or stored in any kind of retrieval system, except by an Affiliate of the IHTSDO in accordance with the SNOMED CT® Affiliate Licence. Any modification of this document (including without limitation the removal or modification of this notice) is prohibited without the express written permission of the IHTSDO.
Any copy of this document that is not obtained directly from the IHTSDO [or a Member of the IHTSDO] is not controlled by the IHTSDO, and may have been modified and may be out of date. Any recipient of this document who has received it by other means is encouraged to obtain a copy directly from the IHTSDO [or a Member of the IHTSDO. Details of the Members of the IHTSDO may be found at http://www.ihtsdo.org/members/].
Table of Contents
1 Glossary
1.1 Domain Terms
2 Introduction
2.1 Purpose
2.2 Audience
3 Detailed Problem Statement
3.1 Background
3.2 Statement of Problem
3.2.1 Summary of problem as reported
3.2.2 Summary of requested solution
3.2.3 Detailed analysis of reported problem
3.2.4 Subsidiary and interrelated problems
3.2.5 Stakeholders
3.3 Risks / Benefits
3.3.1 Risks of not addressing the problem
3.3.2 Risks of addressing the problem
4 Requirements: criteria for success/completion
4.1 Criteria for success/completion
4.2 Use cases
4.2.1 Use case 1
4.2.2 Use case 2…
4.3 Test Cases
5 Envision Possible Technical Approaches
5.1 Indicative Technical Approaches
5.1.1 Approach One
5.1.2 Approach Two
6 Indicative Project Plan for Elaboration Phase
6.1 Scope
6.2 Skills required
6.2.1 Solution Specification (Elaboration)
6.2.2 Implementation
6.2.3 Preventing recurrence of problem
6.3 Project size, lifecycle, duration and resource requirements
6.4 Deployment
7 Appendices
7.1 Appendix One : Related user requests
Glossary
Domain Terms
Metastasis |
|
Benign | Non-malignant, i.e. neither invasive nor capable of metastasis |
Pre-malignant | Currently benign but statistically above some clinically significant threshold probability for undergoing transformation to a malignant condition within some specified time period. |
Transformation | The process by which either:
|
Malignant | Possessing the capability to invade locally and to undergo metastasis. Formerly this capability was evidenced by, and therefore also implied, metastasis having already occurred; increasingly the term no longer requires, and is thus independent of, extant or historic metastasis |
Invasive | Possessing the ability to invade adjacent tissue but not necessarily to also undergo metastasis |
Non-invasive | Not possessing the ability to invade adjacent tissue |
Locally invasive | Possessing the ability to invade adjacent tissue AND so far actually having invaded local tissues but not metastasized |
Carcinoma-in-situ | Possessing cytologic changes typical of invasive lesions (typically, increased mitosis and nuclear polymorphism) but with no evidence of actual invasion or extension to surrounding tissues. It is presumed that, since no invasion has yet occurred, neither can metastasis. |
Uncertain behavior |
|
Metastatic |
|
Non-metastatic | Malignant but lacking a history of proven metastases |
Primary | The tumour mass arising directly from the original malignant cell line transformation, including after invasion but without metastasis. Technically distinct from the broader notion of any contiguous tumour mass that includes the locus of the primary, because this 'true' primary may have since extended into or otherwise merged with originally discrete local secondaries until only one confluent tumour mass exists. |
Secondary | Any discrete tumour mass arising as a result of metastasis from either the primary or another secondary |
Unknown primary | Either
|
Uncertain primary | (Typically of a metastatic neoplastic disease process) the available clinical and laboratory evidence is insufficient to confidently determine the primary site of origin |
Uncertain whether primary or secondary | Any discrete contiguous tumour mass where it is not possible to be certain whether derived directly from the primary locus, also including metastatic elements, or whether an entirely metastatic lesion distant from the primary locus, despite e.g. histological, cytological and genetic testing. |
Uncertain whether metastatic | Either
|
Introduction
Purpose
The purpose of this project is to consider the scope and nature of the work required to revise and update the mapping / data interoperability relationship between SNOMED CT and ICD-O Version 3.
SNOMED CT projects transition from Inception Phase >> Elaboration Phase >> Construction Phase >> Transition Phase. This document describes the Inception Phase. The elaboration phase, in which one or more technical solutions may be developed and tested, may result in more than one document.
The purpose of the Inception Phase is to agree with stakeholders the detail of the problem to be addressed and its scope boundaries. The resulting problem description must also be sufficiently detailed that the size of any resolution, and the impact it might have on the terminology as a whole and on its users, can be understood.
Subject to adequate review by stakeholders and subsequent revision, the inception phase document becomes the primary input to the Elaboration Phase of the project, in which the potential solutions are considered.
Audience
The audience for this document includes all standards terminology leaders, implementers and users but is especially targeted at those stakeholders from the histopathology and oncology domains, including national Cancer Registries.
Identification of stakeholders
Likely stakeholders include:
World Health Organisation
AJCC American Joint Committee on Cancer (ajcc@facs.org)
UICC Union Internationale Contre le Cancer (info@uicc.org)
FIGO International Federation of Gynaecology and Obstetrics
NCI National Cancer Institute (Peggy Adamo)
International Association of Cancer Registries (iacr@iarc.fr)
Pathology laboratories (cascade through NRCs)
IHTSDO iPALM Special Interest Group
Laboratory Information and Management System (LIMS) vendors
Clinical oncologists
Input from stakeholders
Consulting with the stakeholders listed above will necessarily be a slow process. Because of the number of standards bodies involved, it may be more appropriate for such a consultation to be driven at a higher level than an individual Consultant Terminologist, e.g. the IHTSDO Head of Terminology could request the IHTSDO-WHO Joint Advisory Group to consider the full set of representational issues in the oncology domain and then lead a coordinated international effort to resolve them.
Degree of consensus on the statement of problem
Not known at this stage (stakeholder consultation still pending).
Statement of the problem or need
Summary of problem or need, as reported
The issue as originally logged to the content tracker in March 2010 was expressed only in terms of an outline solution (reproduced below in s3.2). The underlying problems were therefore only implied, but appear to be:
the quality of the current mapping (mirroring) relationship between SNOMED CT and ICD-O must be improved, mostly to correct omissions
the collaboration process by which that mapping/mirror is maintained should be strengthened
the fundamental nature of the mapping/mirroring relationship could be reconsidered.
A supplementary note posted to the tracker item later that year (August 2010) provides further context to the supposed problem:
With the signing of the agreement with WHO, the first meeting of the Joint Coordination Group identified coordination of ICD-O with SNOMED morphology as one of the 5 work items that the cooperative work will be addressing.
This note refers to the joint agreement signed on 25th July 2010 between IHTSDO and WHO. The agreement commits both parties to a sustained collaboration toward the ultimate goal of achieving a harmonized integration between SNOMED and all WHO-FIC classifications. The continuing collaboration is today managed by a Joint Advisory Group (JAG).
When the collaboration started, back in 2010, the JAG ultimately identified two potential work streams to be their highest priority: improving the map from SNOMED CT to ICD-10, and developing a common ontological framework to underpin both SNOMED CT and ICD-11 so that, ultimately, ICD-11 might be formally expressed and maintained as a 'view' on SNOMED CT rather than as an entirely different system.
However the JAG clearly also considered (but graded of lower priority) the potential for a collaborative workstream tasked with re-examining the relationship between ICD-O and SNOMED CT: both the current relationship, and how they should interrelate in future. The solution outlined in the original tracker entry (below) suggests a working assumption that SNOMED CT should continue to closely mirror the ICD-O-3 Morphology and Topography chapters.
Summary of requested solution
The tracker item outlines a possible four-part solution:
Updating the mapping of SNOMED CT Body structures to ICD-O-3 topographies.
Updating the mapping of SNOMED CT cancer related (M-8 and M-9 SNOMEDIds) morphologic abnormality concepts to ICD-O-3 where appropriate. For example, for several years there were two morphology entities that existed in ICD-O (as published by WHO) but for which no corresponding concept existed in SNOMED CT with the analogous SNOMEDID (both omissions were corrected in the July 2011 release of SNOMED CT):
8155/1 Vipoma, NOS and 8935/0 Stromal tumor, benign
Additionally, out of 4,528 concepts in the descent of SNOMED's 400177003|Neoplasm and/or hamartoma (morphologic abnormality)|, at least 876 (19.3%) have a SNOMEDID that did not appear to be a valid ICD-O morphology code analog.
Identifying morphologic abnormalities in SNOMED CT which have non-synonymous descriptions from ICD-O-3. Flagging those descriptions as non-synonymous and creating them as new subtype or sibling concepts with appropriate ICD-O-3 mappings. e.g. apparent duplicate mappings of: vipoma (447643008 -> 8155/1, vs 31131002 -> 8155/3)
proliferating trichilemmal tumor (128638007 -> 8103/0, vs 446023005 -> 8103/1).
These may be instances of synonyms needing to be retired as inappropriate. Is "vipoma" a valid synonym of "vipoma, malignant"? Is "proliferating trichilemmal tumor" a valid synonym of "pilar tumor"?
Appropriate FSNs for content with M-8 and M-9 SNOMEDIds so that the FSNs mirror the intended ICD-O meaning.
Statement of problem as understood
Some Topography codes in ICD-O-3 are not currently matched by a semantically equivalent SNOMED CT code or expression, and/or the equivalence is not always published.
Some Morphology codes in ICD-O-3 are not currently matched by a semantically equivalent SNOMED CT code or expression, and/or the equivalence is not always published.
A significant delay often occurs between ICD-O-3 or SNOMED publishing new codes, and the corresponding code equivalence tables being appropriately updated and published by the other party.
For some ICD-O codes (especially morphology codes), the SNOMED CT concept stated as their semantic equivalent may have current descriptions that (a) are not genuine mutual synonyms of one another and/or (b) do not appropriately mirror the meaning, if not also the actual text, of the designated ICD-O equivalent.
Because of the above failings, data originally captured using either SNOMED CT or ICD-O cannot always freely interoperate by means of unambiguous, comprehensive, and fully and professionally maintained tables of equivalence.
A more comprehensive and timely alignment of content is desirable, to be published as
a many:one unidirectional SNOMED-to-ICD-O map for Topography
a one:one bidirectional map for Morphology
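The two target topologies can be checked mechanically once a map is available as (source, target) rows. The following is a minimal sketch, not a description of any IHTSDO tooling: the function name `map_topology` is hypothetical, the morphology rows are the concept-to-code pairs quoted earlier in this section, and the second topography concept ID and the extra C32.1 row are made-up placeholders.

```python
from collections import defaultdict


def map_topology(pairs):
    """Classify (source, target) map rows as one:one, many:one, one:many or many:many."""
    targets_by_source = defaultdict(set)
    sources_by_target = defaultdict(set)
    for source, target in pairs:
        targets_by_source[source].add(target)
        sources_by_target[target].add(source)
    # "many" on the source side: some target is reached from several sources.
    source_side = "many" if any(len(s) > 1 for s in sources_by_target.values()) else "one"
    # "many" on the target side: some source maps to several targets.
    target_side = "many" if any(len(t) > 1 for t in targets_by_source.values()) else "one"
    return f"{source_side}:{target_side}"


# Topography: several SNOMED CT anatomy concepts may share one ICD-O site code.
topography_rows = [
    ("66150005", "C32.0"),
    ("HYPOTHETICAL-ANATOMY-ID", "C32.0"),  # placeholder, not a real concept ID
]

# Morphology: the concept/code pairs quoted in the requested solution.
morphology_rows = [
    ("447643008", "8155/1"),
    ("31131002", "8155/3"),
    ("128638007", "8103/0"),
    ("446023005", "8103/1"),
]

print(map_topology(topography_rows))   # many:one
print(map_topology(morphology_rows))   # one:one
# A hypothetical second target for one concept breaks the many:one property.
print(map_topology(topography_rows + [("66150005", "C32.1")]))  # many:many
```

A one:one result is what makes a map safely reversible, which is why the bidirectional requirement is stated only for Morphology.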
Detailed analysis of reported problem, including background
Background
ICD-O
ICD-O was first published by the World Health Organisation in 1976. It is the standard tool used globally for coding neoplastic diagnoses (whether benign or malignant), both by pathology laboratories and by the tumour and cancer registries they report to. Version 2 followed in 1990, and version 3 in 2001.
Since its first version, ICD-O has been conceived as a relatively crude compositional scheme: neoplastic diseases are independently coded along two primary axes (topography and morphology) of which one (morphology) may be further characterized along two further subsidiary axes (behavior and grade). A full ICD-O encoding, therefore, comprises a concatenation of four subcodes:
2-3 digit topography code (anatomical location of primary tumour site; revised in ICD-O version 2. Code Range C00-C80.9; there are currently 400 distinct site codes in ICD-O-3)
4 digit histology / morphology code (Code range: M8000-M9989; originally only histopathological type but, since version 3, also includes other techniques for tumour typing such as cytogenetic markers. There are 765 distinct morphology codes in ICD-O-2)
1 digit behaviour code (modifies morphology code)
*Not used by cancer registries (used by some pathologists in some parts of the world)
1 digit 'aggression' code (degree of differentiation or grade of the tumour; also modifies the morphology code)
Code | Grade | Differentiation |
1 | Grade I | Well differentiated or Differentiated, NOS |
2 | Grade II | Moderately differentiated, intermediate differentiation |
3 | Grade III | Poorly differentiated |
4 | Grade IV | Undifferentiated, Anaplastic |
9 | | Grade of differentiation not determined, stated or applicable |
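The four-part structure described above can be illustrated with a small parser. This is a sketch only, assuming the written form `M-ABCD/EG Cnn.n` (morphology, behaviour, optional grade, optional topography); the class and function names are hypothetical, and the topography value C32.0 is paired arbitrarily with the morphology code, purely to exercise the syntax.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Grade digits and labels as listed in the grade table above.
GRADE_LABELS = {
    "1": "Grade I",
    "2": "Grade II",
    "3": "Grade III",
    "4": "Grade IV",
    "9": "Grade of differentiation not determined, stated or applicable",
}


@dataclass(frozen=True)
class IcdOEncoding:
    morphology: str            # 4-digit histology code
    behaviour: str             # 1 digit, modifies the morphology
    grade: Optional[str]       # 1 digit, optional
    topography: Optional[str]  # e.g. "C32.0", optional


# Assumed written form: optional "M-" prefix, ABCD/E, optional grade digit,
# then an optional topography code separated by whitespace.
_PATTERN = re.compile(
    r"^(?:M-?)?(?P<morphology>\d{4})/(?P<behaviour>\d)(?P<grade>\d)?"
    r"(?:\s+(?P<topography>C\d{2}(?:\.\d)?))?$"
)


def parse_icdo(text):
    """Split an ICD-O encoding string into its subcodes."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognisable ICD-O encoding: {text!r}")
    return IcdOEncoding(**match.groupdict())


encoding = parse_icdo("M-8631/01 C32.0")
print(encoding)
print(GRADE_LABELS[encoding.grade])  # Grade I
```

Keeping grade and topography optional reflects that a morphology-plus-behaviour code on its own is still a well-formed fragment.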
In principle, therefore, ICD-O-3 allows for in excess of 9.1 million distinct combinations of a morphology, topography, behaviour and grade code. In practice, of course, many combinations are impossible or highly improbable (e.g. M-8631/01 C71.1 : Benign, well differentiated Sertoli-Leydig cell tumour of frontal lobe). Within its overarching compositional framework, ICD-O-3 is consequently more commonly delivered by WHO as a list of 2271 well known diagnostic labels, each of which is given a centrally assigned, normative, precoordinated morphology and behaviour code (but not a Topography code). Thus, if the expert consensus is that the clinical entity called 'Burkitt's Lymphoma' is always malignant, then ICD-O provides:
9687/3 Burkitt Lymphoma
However, if such a diagnosis is confirmed histologically (as a 9687) but the subsequently observed behaviour of the specific tumour instance is other than the usual malignant, it is permissible to record (for example) 9687/2 even though this code was absent from the official distribution.
Some have, therefore, attempted to further enumerate the clinically plausible cross-product of Morphologies with Topographies and also to extend the set of recognized synonyms. These endeavours typically arrive at an enumerated list of some 9000 Topography+Morphology+Behaviour+Term combinations for the clinically plausible oncology domain. See e.g. http://www.seer.cancer.gov/icd-o-3/
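The figure of "in excess of 9.1 million" combinations quoted above can be reproduced from the counts given in this section. The sketch below is one plausible reconstruction, not the author's stated working: the topography and morphology counts are those quoted in the text, the five grade codes are those in the grade table, and the count of six behaviour codes (/0, /1, /2, /3, /6 and /9) is an assumption, since the behaviour codes are not listed here.

```python
TOPOGRAPHY_CODES = 400  # distinct site codes, as quoted in the text
MORPHOLOGY_CODES = 765  # distinct 4-digit morphology codes, as quoted in the text
BEHAVIOUR_CODES = 6     # assumption: /0, /1, /2, /3, /6 and /9
GRADE_CODES = 5         # codes 1, 2, 3, 4 and 9 from the grade table

total = TOPOGRAPHY_CODES * MORPHOLOGY_CODES * BEHAVIOUR_CODES * GRADE_CODES
print(f"{total:,}")  # 9,180,000

# Roughly 9000 enumerated combinations are described as clinically plausible.
plausible = 9000
print(f"{plausible / total:.3%}")  # 0.098%
```

On these assumptions, the clinically plausible enumerations cover about one combination in a thousand of the theoretical space.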
Changes in ICD-O
Like all clinical terminologies, ICD-O has had to change its content as medical knowledge has evolved and the expert consensus changes. The most recent content changes, from version 2 to 3, included:
Some tumours previously graded as borderline (behaviour code 1) are now reclassified as malignant (code 3). Similarly, some previously malignant (3) are now officially borderline (1), and a few have moved between benign (0) and borderline (1). For example:
Term | ICD-O-2 | ICD-O-3 |
|---|---|---|
Neurocytoma | M9506/0 | M9506/1 |
Desmoplastic fibroma | M8823/1 | M8823/0 |
Myelosclerosis with myeloid metaplasia | M9961/1 | M9961/3 |
Papillary meningioma | M9538/1 | M9538/3 |
Papillary ependymoma | M9393/1 | M9393/3 |
Endolymphatic stromal myosis | M8931/1 | M8931/3 |
Pilocytic astrocytoma | M9421/3 | M9421/1 |
Papillary mucinous cystadenoma, borderline malignancy | M8473/3 | M8473/1 |
Mucinous cystadenoma, borderline malignancy | M8472/3 | M8472/1 |
Papillary serous cystadenoma, borderline malignancy | M8462/3 | M8462/1 |
Papillary cystadenoma, borderline malignancy | M8451/3 | M8451/1 |
Serous cystadenoma, borderline malignancy | M8442/3 | M8442/1 |
203 new 4-digit Morphology codes not previously in ICD-O-2 were added to ICD-O-3
97 4-digit Morphology codes previously in ICD-O-2 were removed from ICD-O-3
1032 precoordinated 5-digit Morphology+Behaviour code combinations not previously in ICD-O-2 were added in ICD-O-3
849 precoordinated 5-digit Morphology+Behaviour code combinations previously found in ICD-O-2 were removed from ICD-O-3
More acronyms are now recognized as synonym terms; some terms are now marked obsolete.
Some terms previously present with UK English spelling (e.g. 'Polycythaemia rubra vera') have been removed in favour of a US English spelling (e.g. 'Polycythemia rubra vera')
The classification of haematological diseases has been significantly re-worked, grouping them according to the REAL classification (i.e. along cell lines rather than on gross morphological features) and with some terms offering detailed differentiation by cytogenetic markers:
9866/3 = acute promyelocytic leukaemia t(15;17)(q22;q11-12)
=translocation (t) of material from long arm (q) of chromosome 15, region 22 with material from the long arm (q) of chromosome 17 between region 11 and 12
Some haematological diseases are also now described by molecular abnormalities:
9866/3 = acute promyelocytic leukaemia t(15;17)(q22;q11-12) or acute promyelocytic leukaemia PML/RAR-alpha
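The behaviour reclassifications tabulated earlier in this list can be detected by comparing the digit after the slash in the ICD-O-2 and ICD-O-3 codes. The following is an illustrative sketch only: the function name `behaviour_shift` is hypothetical, the rows are a sample copied from the table above, and treating a numerically higher behaviour digit as "more aggressive" is an assumption that holds for the 0, 1 and 3 values seen in that table.

```python
# Sample rows from the ICD-O-2 to ICD-O-3 comparison table.
CHANGES = {
    "Neurocytoma": ("M9506/0", "M9506/1"),
    "Desmoplastic fibroma": ("M8823/1", "M8823/0"),
    "Papillary meningioma": ("M9538/1", "M9538/3"),
    "Pilocytic astrocytoma": ("M9421/3", "M9421/1"),
}


def behaviour_shift(icdo2_code, icdo3_code):
    """Describe how the behaviour digit changed between two versions of a code."""
    morphology2, behaviour2 = icdo2_code.split("/")
    morphology3, behaviour3 = icdo3_code.split("/")
    if morphology2 != morphology3:
        raise ValueError("morphology code differs, not a pure behaviour change")
    if int(behaviour3) > int(behaviour2):
        return "more aggressive"
    if int(behaviour3) < int(behaviour2):
        return "less aggressive"
    return "unchanged"


for term, (old, new) in CHANGES.items():
    print(f"{term}: {old} -> {new} ({behaviour_shift(old, new)})")
```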
(Mis-)usage of ICD-O, and ICD-O internal inconsistencies
Although designed and intended to be used as a compositional system (i.e. where a full ICD-O encoding should always contain both a morphology and a topography code as a minimum), in practice many laboratory sites omit the Topography coding component altogether, opting instead to record only the Morphology element. This is perhaps understandable when many histopathological morphologies can either only ever arise from one anatomical locus, or so rarely arise anywhere else that the default assumption is very rarely wrong. For example, the anatomical location of the primary is somewhat redundant to state if the morphology is given by any of:
8631/0 | Sertoli-Leydig cell tumour |
8042/3 | oat cell carcinoma |
8162/3 | Klatskin's tumour |
In other cases the ICD-O morphology code itself appears to already explicitly specify the locus, such that the additional use of a Topography code can appear to be genuinely entirely redundant:
8541/3 | Paget's disease and infiltrating duct carcinoma of breast |
8161/0 | bile duct cystadenoma |
8161/3 | bile duct cystadenocarcinoma |
8170/0 | liver cell adenoma |
SNOMED and ICD-O
In common with several other clinical terminologies in use in the 1990s, and also probably because of SNOMED's origins as SNOP (a coding scheme specifically for histopathology), SNOMED sought to facilitate its crossmapping to ICD-O.
Being itself an early compositional system, and sharing ICD-O's ontological view that many diseases could be modeled as 'some histopathological phenomenon at some anatomical site', legacy versions of SNOMED (SNOMED 3 International, SNOMED RT) chose to closely mirror the ICD-O classification. In particular, the M-8000 to M-9989 codes in the morphology axis of SNOMED 3 were, by agreement with WHO, engineered to be a straight copy of the morphology classification in ICD-O version 2. This mirroring extended to the codestring level: the ICD-O codestring for a given Morphology-plus-Behaviour code combination could, originally, be trivially lexically transformed into the exactly corresponding SNOMED code and vice versa: ICD-O-2 M-ABCD/E becomes M-ABCDE in SNOMED 3 International, where E is the behaviour code element of the ICD-O encoding.
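The original lexical transformation described above (M-ABCD/E to M-ABCDE and back) amounts to moving one separator. A minimal sketch follows; the function names are hypothetical, and the transform illustrates only the historical convention, which the rest of this section explains can no longer be relied upon.

```python
import re


def icdo_to_snomed_id(icdo_code):
    """Turn an ICD-O morphology/behaviour code 'ABCD/E' into the legacy form 'M-ABCDE'."""
    match = re.fullmatch(r"(?:M-?)?(\d{4})/(\d)", icdo_code)
    if match is None:
        raise ValueError(f"not an ICD-O morphology/behaviour code: {icdo_code!r}")
    morphology, behaviour = match.groups()
    return f"M-{morphology}{behaviour}"


def snomed_id_to_icdo(snomed_id):
    """Turn a legacy SNOMEDID 'M-ABCDE' back into the ICD-O form 'ABCD/E'."""
    match = re.fullmatch(r"M-(\d{4})(\d)", snomed_id)
    if match is None:
        raise ValueError(f"not a five-digit morphology SNOMEDID: {snomed_id!r}")
    morphology, behaviour = match.groups()
    return f"{morphology}/{behaviour}"


print(icdo_to_snomed_id("9950/1"))   # M-99501
print(snomed_id_to_icdo("M-99503"))  # 9950/3
```

The round trip is lossless only for identifiers that follow the five-digit pattern; anything else is rejected rather than guessed at.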
However, changes in SNOMED from 1993 onward, culminating in SNOMED-RT, led to identifier and content incompatibilities, particularly for non-neoplastic lesions. As a result, the electronic ICD-O-3 release files from WHO no longer include explicit maps to SNOMED-RT codes. Since 2000, the mapping relationship between ICD-O and SNOMED CT has diverged further.
For example, the morphology code changes that occurred between ICD-O versions 2 and 3 were handled in SNOMED CT by retiring those SNOMED codes corresponding to the (now deprecated) ICD-O-2 codes, then creating new SNOMED concepts for the new ICD-O-3 codes, and finally (where appropriate) inserting history relationships to link one to the other. Thus, for example, to reflect ICD-O-3's reclassification of polycythemia vera (PCV) as a malignant condition, the SNOMED code:
31569001|Polycythemia vera (morphologic abnormality)| SNOMEDID = M-99501
…is now retired, and REPLACED_BY:
128841001|Polycythemia vera (morphologic abnormality)| SNOMEDID = M-99503
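Following such history relationships from a retired concept to its current replacement can be sketched as a simple lookup loop. This is an illustration under stated assumptions rather than a description of the release file format: the dictionary holds only the single REPLACED_BY pair given above, and the function name `current_concept` is hypothetical.

```python
# Retired concept ID -> replacement concept ID (the one pair quoted above).
REPLACED_BY = {
    "31569001": "128841001",  # Polycythemia vera: M-99501 retired, M-99503 current
}


def current_concept(concept_id, replaced_by=REPLACED_BY):
    """Follow REPLACED_BY links until a concept with no replacement is reached."""
    seen = set()
    while concept_id in replaced_by:
        if concept_id in seen:
            raise ValueError(f"cycle in history relationships at {concept_id}")
        seen.add(concept_id)
        concept_id = replaced_by[concept_id]
    return concept_id


print(current_concept("31569001"))   # 128841001
print(current_concept("128841001"))  # 128841001
```

Data recorded against the retired concept can then be read as the ICD-O-3 meaning, provided the replacement is accepted as semantically equivalent for the purpose at hand.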
Current encoding of map from SNOMED CT to ICD-O Topography
In the same way that the ICD-O-3 release files no longer include an explicit map to SNOMED CT, so the core SNOMED CT table (sct1_concept) no longer guarantees either that all ICD-O-3 Topography or Morphology codes that currently exist will also appear somewhere in the SNOMEDID column or, where they do, that they're necessarily attached to the right underlying concept.
For this reason, it is no longer valid to crossmap to ICD-O by naïve code lookup on the SNOMEDID element of the sct1_concept table. This technique will accordingly become essentially impossible to use on RF2 distributions of SNOMED CT, within which there is no SNOMEDID column.
Instead, the International Edition of SNOMED CT (in either RF1 or RF2 release format) now contains a single crossmap table (mapSetID 102041) which encodes a mapping between 22,763 SNOMED CT anatomy concepts and 287 ICD-O-3 Topography codes, and between 1191 SNOMED CT Morphological Abnormality concepts and 1122 ICD-O-3 Morphology codes. This is a unidirectional map; it doesn't support reverse mapping from an ICD-O topography code to SNOMED CT. The map topology is, in fact, many:many rather than a strict many:one map: a minority (n=136) of SNOMED anatomy concepts have more than one possible ICD-O map:
SNOMED CT concept | SNOMED CT term | ICD-O-3 code | ICD-O-3 term | Option | Priority |
|---|---|---|---|---|---|
66150005 | Structure of cricothyroid ligament (body structure) | C32.0 | Glottis | 0 | 0 |