2023-10-24 - TRAG Meeting Agenda/Minutes

Date

  • Tuesday 24th October 2023  -  13:30 - 17:30 (EDT) (17:30 - 21:30 UTC)

Room: Muse 1 (Starling Hotel, Atlanta)





Attendees

  • @Andrew Atkinson, Chair

  • @Mounir Bouzanih (Unlicensed), member

  • @Mikael Nyström (Unlicensed), member

  • @Patrick McLaughlin, member

  • @Alejandro Lopez Osornio, member

  • @Stuart Abbott (Unlicensed), member

  • @Matt Cordell, member

  • @Gábor Nagy (Unlicensed) , member

  • @Dion McMurtrie, guest/observer

  • @michael lawley, guest/observer

  • @Reuben Daniels, guest/observer

  • @Chris Morris, staff/observer

  • @Maria Braithwaite, staff/observer

  • @Janice Spence, observer

Apologies

  • @Former user (Deleted), member


Objectives

  • Briefly discuss each item

  • Agree on the plan to analyse and resolve each issue, and document the Action points

  • All those with Action points assigned to them to agree to complete them before the next face to face conference meeting

Discussion items



Subject

Owner

Notes & Actions

1

Welcome!

All

Thanks to our members for all of their help. Welcome to our observers!

INTRODUCTIONS...

We've got several topics that we've resolved and closed down

As always, we won't waste time going through them again in detail, but if you'd like to read through them they're listed below...  

I'll also run through them very quickly from a high level, and if you have any further questions/news on any of the discussions please let me know now and we can decide whether or not to re-open them...

2

Conclusion of previous Discussions topics

3

The possibility of updating inactive content

All

https://projects.jira.snomed.org/browse/MSSP-1670

Please see the ticket above for full explanation - in brief:

  • Descriptions in the 20220731 International Edition snapshot description file appear to contain character code 160 (the non-breaking space, U+00A0, referred to in the ticket as "ASCII Character 160") where a standard space (ASCII 32) was intended. The non-breaking space could potentially create issues with ETL processes.

  • The suggestion is that issues caused by non-printable ASCII / UNICODE / UTF-8 characters need to be covered under their own policy because simple inactivation does not resolve the issues caused by these characters in ETL and interoperability processes.

  • Unfortunately removing these characters from inactive content contravenes our current policy, which is to only update inactive content (whether this be via the AP or via a back-end fix by the tech team) where a "critical issue" has been found. The term "critical" is used specifically to clearly denote only those issues which present risks such as clinical patient-safety or legal liability, for example.  Therefore, in order for us to flag up these inactive records as a clinical safety issue, we'd need evidence of reports from users explaining how they present such a risk to their patients.  

  • Confirmed by the content team that validation for non-breaking spaces is in place already for active content, and so no improvement to validation is required.

  • From what we can tell, many of these have been in the release for years now, but we have not received any feedback that they have caused an issue thus far. This is therefore not a "critical" issue - however we'd appreciate the community confirming whether there would be any issues with making the fixes directly on inactive descriptions?

  • The following Descriptions in the 20220731 International edition snapshot description file were found to contain ASCII Character 160 for Non-breaking space, when the character should be ASCII 32 for a standard space. ASCII Character 160 creates issues with many ETL processes.

  • All of these issues are in inactive descriptions:



  • ID          Column  Issue
    2869833013  [term]  Code:160, Position:27
    2870804019  [term]  Code:160, Position:27
    2871691013  [term]  Code:160, Position:16
    2880511019  [term]  Code:160, Position:21
    2880958016  [term]  Code:160, Position:118|Code:160, Position:149
    2881152012  [term]  Code:160, Position:118|Code:160, Position:124
    2882107012  [term]  Code:160, Position:118|Code:160, Position:149
    2882999016  [term]  Code:160, Position:118|Code:160, Position:124
    2884068015  [term]  Code:160, Position:21
    3030804017  [term]  Code:160, Position:30
    3030901012  [term]  Code:160, Position:30

  • The suggestion is that "issues caused by non-printable ASCII / UNICODE / UTF-8 characters need to be covered under their own policy because simple inactivation does not resolve the issues caused by these characters in ETL and interoperability processes. Given the amount of inactive SNOMED content present in the data stream, it would be best if these characters could be removed entirely from even inactive descriptions. While working in healthcare implementations, the presence of ASCII Character 160 (non-breaking space) in the LOINC descriptions broke the entire ETL process between the data warehouse and Research databases and required me to jump through some programming hoops to remove these characters from the LOINC descriptions."

  • Whilst we appreciate the impact that these characters might have on ETL processes,  from a content team perspective, this is not a critical issue.  All of the current issues are related to inactive descriptions and validations are in place to prevent this from occurring in the future. 

  • Unfortunately removing these characters from inactive content contravenes current SI policy, which is to only update inactive content (whether this be via the AP or via a back-end fix by the tech team) where a "critical issue" has been found.  The term "critical" is used specifically to clearly denote only those issues which present risks such as clinical patient-safety or legal liability, for example.  Therefore, in order for us to flag up these inactive records as a clinical safety issue, we'd need evidence of reports from users explaining how they present such a risk to their patients.  

  • SI are always reluctant to change SNOMED CT history. However, there are situations where we have had to do that in the past. We are therefore bringing this to the TRAG for consideration...

  • A couple of people thought it might be easier to update the inactive content rather than getting repeated complaints over the years - however the vast majority disagreed, and thought that not only was it a waste of valuable resource to update inactive content, but more importantly actually contravened the spec at this level!  This is because the INT Edition specifies itself as a UTF-8 format, and the ASCII 160 characters are UTF-8 compliant!  Therefore where would we stop once we start excluding certain UTF-8 characters from the INT Edition? 

  • Instead, it should be the responsibility of the end implementations to exclude any characters that disagree with their ETL routines/programs.

  • Response added to https://projects.jira.snomed.org/browse/MSSP-1670

  • TRAG RECOMMENDATION WAS TAKEN - our SI specs specify UTF-8 format, and as ASCII 160 characters in question are UTF-8 compliant, any changes would be contravening our own specifications.  Therefore all were agreed that no changes should be made in the content - instead it should be the responsibility of the end implementations to exclude any UTF-8 characters that cause issues for their ETL routines.
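
For implementers who want to handle this on their own side, as recommended above, the following is a minimal Python sketch and not part of the ticket or of any SI tooling. The function names are hypothetical, and the 1-based position numbering is only an assumption about how the positions in the table above were counted. Character 160 is a valid Unicode character, encoded in UTF-8 as the two bytes C2 A0, which is why a strict UTF-8 check alone will not flag it.

```python
# Sketch only: detect and replace non-breaking spaces in description terms
# before they reach an ETL pipeline. All names here are illustrative.

NBSP = "\u00a0"  # code point 160, the non-breaking space


def find_nbsp_positions(term: str) -> list[int]:
    """Return the positions (assumed 1-based) of every non-breaking space."""
    return [i for i, ch in enumerate(term, start=1) if ch == NBSP]


def format_issue(positions: list[int]) -> str:
    """Render positions in the same style as the 'Issue' column above."""
    return "|".join(f"Code:160, Position:{p}" for p in positions)


def replace_nbsp(term: str) -> str:
    """Return the term with each non-breaking space replaced by a standard space."""
    return term.replace(NBSP, " ")


if __name__ == "__main__":
    sample = "Example\u00a0term with\u00a0two non-breaking spaces"  # made-up term
    print(format_issue(find_nbsp_positions(sample)))
    print(replace_nbsp(sample))
```

Doing the replacement at load time keeps the published release files untouched, which is consistent with the recommendation that no changes be made to the content itself.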

4

Community Consultation: Proposed changes to the RF2 Identifier File Specification

All

Full details can be found in:  SNOMED International Proposal to change the RF2 Identifier File specification

Main points for TRAG consideration:

  • Proposed Changes to RF2 - any issues with the proposal?

    The current column headers for this file are:

    The new proposed format is:

    The data types will remain the same, as detailed in the current RF2 specification:  4.2.4 Identifier File Specification   

  • Perceived Impact - is this the case?

    This change is not expected to have any impact on implementers of existing systems, as SNOMED International are not aware of organisations who currently represent entities from other code systems directly in SNOMED CT, as opposed to mapping to it.   As such, consumption of the new file would only be required by organisations who have an interest in working with such content.

  • FEEDBACK - Walk through the feedback received so far and confirm if any further points?

Feedback requested:

Feedback on the File changes was varied, but generally speaking there were no strong objections to the changes to the file.
HOWEVER, there were strong objections to the overall plan to publish LOINC as a separate "Extension".
This is due to the additional Friction caused by having yet another component in a separate package.
Implementers would greatly prefer it to all be published in the same package as the International content.
Having spoken to Rory, this is a CONTRACTUAL issue - we cannot align the SNOMED CT licence with the LOINC licence in order to publish both types of content in the same package!
This is therefore the ONLY option - we have to publish both the LOINC Identifier file + the LOINC content itself in a separate Extension package, dependent on the International Edition.
 
So we now need to go back to the intended changes to the Identifier file format, and confirm whether or not these are acceptable to everyone?
Initial feedback:
We're making it "look" like the other RF2 files, but it's not!  The Identifier column is NOT a primary key as you'd expect, as in other files with UUID's (even though they also technically have compound keys such as UUID+moduleID+active, etc)
We'd have to make the "ReferencedComponentID" field mutable, as otherwise when a mistake is made and we need to change this field to another ID, we have no other option than to create a DUPLICATE record which has everything the same except for active+ReferencedComponentID.
This shouldn't be too much of a problem though, as we can make the ReferencedComponentID field mutable if we need to
Most people would prefer to use a Refset instead in order to be more flexible
We could have a unique primary key (like a UUID)
We could express one-one and one-many relationships, etc
URI attributes such as Concrete Domains could be a much more useful addition to the identifier file?
.
 
5

AttributeValue field immutability in the RF2 files

ALL

Just a very quick one (especially for those who were in the MAG yesterday and have already heard this!) - the immutability of the valueID field is specified as being "depends on specific use" - see here:

The MAG are all happy to change this to "mutable", and so are we - however I just wanted to give those here who weren't in the MAG a chance to raise a valid objection in case anyone can identify a really strong reason why this field shouldn't be mutable??

No objections raised

6

Active Discussions for October 2023



7

Welcome and thank you!



Welcome to new members!

8

Member Nominations



Please let us know if anyone is interested (and who has the requisite domain knowledge and expertise) in applying for a seat on the TRAG - thanks!

We're looking for new members to take the place of some outgoing chairs - if you have any Nominations please let me know either this week or by email after.   Thanks!

9

Proposal to deprecate the Concept Non-Current (CNC) Indicators

ALL

See Peter Williams' proposal on the subject:

The Case For Removing Description Concept Non-Current Indicators

The key points of the proposal that are salient to our group are:
TODAY

  • Discuss whether or not there are any known users still using the CNC indicators?

  • Discuss whether there are any valid use cases still in existence to retain them? (whether in use or not)

  • Are there any dissenting arguments against the assertion that CNC indicators are completely obsolete, and that it's more efficient and reliable to determine the relevant Concept's state from the Concept record, rather than from the AttributeValue record?

  • Are there any dissenting arguments against the assertion that CNC indicators cause additional work for all creators of SNOMED CT content, in terms of maintenance, packaging and validation?

  • Are there any dissenting arguments against the assertion that the removal of CNC indicators will serve to simplify the understanding of SNOMED CT, + help to lower barriers to adoption?

  • Discuss any impacts to the terminology, or to any users, when removing the CNC indicators? (beyond our internal impact, which is restricted solely to removing/simplifying existing code)

  • Discuss who, if anyone, we should specifically target for feedback on the proposal?

  • Agreement in principle of the deprecation of CNC indicators by the TRAG.
 
PROPOSED TIMELINE (updated)
December 2023 = Publication of an official Notice of Deprecation, and request for feedback
July 2024 INT Edition = Inactivation of all existing CNC indicators
    + Removal of all code creating new CNC indicators
    + Removal of all code relating to the display and validation of CNC indicators
Future = We're not at any point suggesting surgical extraction of the historical CNC indicators (using negative deltas or any other such mechanism!)  Just inactivation and keeping them static in perpetuity after that.
HOWEVER, given that they consume 108 MB of the International Edition, is there an argument for complete removal?
YES, we can inactivate them for 6-12 months then REMOVE these records COMPLETELY (not AttributeValue files)
WE SHOULD ALSO REMOVE THE ENTIRE STATEDRELATIONSHIP FILES WHICH HAVE BEEN INACTIVATED FOR 5 years or so now
WE SHOULD PROBABLY GIVE THE USERS 12-18 MONTHS FOR COMPLETE REMOVAL
@Andrew Atkinson  TO WRITE UP SOME COMMS AND SEND OUT ONCE APPROVED INTERNALLY....
.
Any other Feedback?
10

Annotations - outcome of today's MAG discussions



  1. Language code to be removed from the implementation entirely - they're extraneous at present, and no valid use case is apparent at present.  Therefore the preference is to simplify the current implementation wherever possible, and manage any language considerations using the moduleID for now.

  2. Agreement on the distinction between data and metadata - for now we will use the new refsets to annotate metadata only, and data will be addressed using additional relationships.  Only remaining question will then be how to distinguish between what is technically data vs metadata!

11

Annotations - Language Code 

ALL

Hi @Matt Cordell @Dion McMurtrie @michael lawley @Alejandro Lopez Osornio @Mounir Bouzanih (Unlicensed) @Patrick McLaughlin @Mikael Nyström (Unlicensed) @Stuart Abbott (Unlicensed) @Gábor Nagy (Unlicensed) @Reuben Daniels 

There are several options for managing the Language code -

  • see the options here:  Annotations review 

  • Plus also see the "Representation of annotation data type" section of the MAG proposal: SNOMED CT Annotations (this is because the other option was to include "@language" in the "annotationValue" field, which would be a problematic idea for implementations as the entire field would have to be parsed each time in order to extract the language code) 

Please provide feedback asap before the next TRAG meeting in October, so that we can try to unblock development.  thanks!
Decision made to use the "@[language]" (eg. "@en") in the "annotationValue" field
***** BUT THEN THE MAG JUST MADE A NEW (unanimous) DECISION - TO REMOVE ALL LANGUAGE CODES FROM THIS IMPLEMENTATION (as they're extraneous at present, and no valid use case is apparent at present)
Any other major concerns with the decision made to remove the language code completely?
YES - Mikael has strong objections so we took a vote and adding a new Column (which would be empty (NOT null) when not required) won 8 votes to 5
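
To illustrate the parsing concern raised against the "@language" option, here is a hedged Python sketch contrasting the two approaches. The minutes do not say where in the annotationValue field the "@en" tag would sit, so a trailing suffix is assumed here, and the function names and column position are hypothetical.

```python
# Sketch only: why an embedded "@en" tag is harder to consume than a column.
# Assumption: the tag would be a trailing suffix such as "Some note@en".


def split_language_suffix(annotation_value: str) -> tuple[str, str]:
    """Try to split a trailing '@xx' language tag off an annotation value.

    Every value has to be scanned, and values that legitimately contain '@'
    (for example an email address) need a heuristic to avoid a false match.
    """
    text, sep, lang = annotation_value.rpartition("@")
    if sep and lang.isalpha() and len(lang) in (2, 3):
        return text, lang
    return annotation_value, ""


def read_language_column(fields: list[str], lang_col: int = -1) -> str:
    """With a dedicated column the code is read directly; empty means 'not required'."""
    return fields[lang_col]


if __name__ == "__main__":
    print(split_language_suffix("Some note@en"))
    print(split_language_suffix("Contact user@example.com"))
    print(repr(read_language_column(["some-id", "Some note", ""])))
```

The dedicated column avoids the heuristic entirely, at the cost of an extra field that is empty (not null) whenever no language applies.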
12

Annotations - file naming conventions

ALL

Now that we've got the Language Code approach confirmed, we need to agree the file naming conventions for the 4x new refset types that will be included in the International Edition going forwards (although only the 2x String Value refsets will now be introduced initially - see below).

The four planned refsets are as follows:

  • 1292996001 |Member annotation with component value reference set (foundation metadata concept)|

  • 1292995002 |Member annotation with string value reference set (foundation metadata concept)|

  • 1292994003 |Component annotation with component value reference set (foundation metadata concept)|

  • 1292992004 |Component annotation with string value reference set (foundation metadata concept)| 

So they will be held in the /Refset/Content /Refset/Metadata subfolders, and the initial naming convention would suggest something along the lines of:

  • der2_sscsRefset_MemberAnnotationComponentValueSnapshot_INT_[date].txt

  • der2_sscsRefset_MemberAnnotationStringValueSnapshot_INT_[date].txt

  • der2_scsRefset_ComponentAnnotationComponentValueSnapshot_INT_[date].txt

  • der2_scsRefset_ComponentAnnotationStringValueSnapshot_INT_[date].txt

However we're happy to take any feedback on these before finalising them?

FYI We've decided to release the first 2x definite refsets (as empty files only for now) into the December 2023 Release, in order to trial it and get people used to them - we may introduce the other 2x in future releases if required:

  • der2_sscsRefset_MemberAnnotationStringValueSnapshot_INT_[date].txt

  • der2_scsRefset_ComponentAnnotationStringValueSnapshot_INT_[date].txt



In addition to this, we need to roll out the new refsets to all SNOMED Products - for example, National extensions will create their own annotation attributes and values in addition to those we have covered in the annotation document, such as information about medicinal products.   Other examples include the category annotation attribute for LOINC, which could belong to the LOINC module. 

We would therefore be looking to use conventions like this for Extensions:

  • der2_sscsRefset_MemberAnnotationComponentValueSnapshot_[CountryCode + Namespace]_[date].txt

  • der2_sscsRefset_MemberAnnotationStringValueSnapshot_[CountryCode + Namespace]_[date].txt

  • der2_scsRefset_ComponentAnnotationComponentValueSnapshot_[CountryCode + Namespace]_[date].txt

  • der2_scsRefset_ComponentAnnotationStringValueSnapshot_[CountryCode + Namespace]_[date].txt

And conventions like this for Derivatives:

  • der2_sscsRefset_[derivative name]MemberAnnotationComponentValueSnapshot_INT_[date].txt

  • der2_sscsRefset_[derivative name]MemberAnnotationStringValueSnapshot_INT_[date].txt

  • der2_scsRefset_[derivative name]ComponentAnnotationComponentValueSnapshot_INT_[date].txt

  • der2_scsRefset_[derivative name]ComponentAnnotationStringValueSnapshot_INT_[date].txt

(eg)

  • der2_sscsRefset_OrphanetMemberAnnotationComponentValueSnapshot_INT_20240131.txt

  • der2_sscsRefset_OrphanetMemberAnnotationStringValueSnapshot_INT_20240131.txt

  • der2_scsRefset_OrphanetComponentAnnotationComponentValueSnapshot_INT_20240131.txt

  • der2_scsRefset_OrphanetComponentAnnotationStringValueSnapshot_INT_20240131.txt

Again, any concerns or suggestions?
NO - @Andrew Atkinson to write these up in Comms to go out to community in Release Notes
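
As a rough aid for anyone scripting against these names, the conventions above can be sketched in Python as follows. This is an illustration only and not SI's packaging code: the function name, its parameters and the "XX1000000" extension code are hypothetical, and the pattern prefixes are simply copied from the proposed filenames above.

```python
# Sketch only: build annotation refset filenames following the proposed conventions.

# Field-pattern prefix per refset type, as shown in the proposed filenames.
PATTERNS = {
    "MemberAnnotationComponentValue": "sscs",
    "MemberAnnotationStringValue": "sscs",
    "ComponentAnnotationComponentValue": "scs",
    "ComponentAnnotationStringValue": "scs",
}


def annotation_filename(refset_type: str, date: str, package: str = "INT",
                        derivative: str = "", release_type: str = "Snapshot") -> str:
    """Return a filename such as der2_scsRefset_<Type>Snapshot_INT_<date>.txt.

    package    - "INT", or CountryCode + Namespace for an Extension
    derivative - optional derivative name placed before the refset type
    """
    pattern = PATTERNS[refset_type]
    return (f"der2_{pattern}Refset_{derivative}{refset_type}"
            f"{release_type}_{package}_{date}.txt")


if __name__ == "__main__":
    # International Edition
    print(annotation_filename("ComponentAnnotationStringValue", "20240131"))
    # Extension, using a made-up CountryCode + Namespace
    print(annotation_filename("MemberAnnotationStringValue", "20240131",
                              package="XX1000000"))
    # Derivative
    print(annotation_filename("MemberAnnotationStringValue", "20240131",
                              derivative="Orphanet"))
```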
13

Annotations - Refset File types

ALL

The Primary Use Cases for Annotations have been provided as follows:

All of the above examples appear to fall nicely within the new "METADATA" definition for Annotations...
Except, perhaps, for the last case??
Thoughts??

The new Annotations Refsets do not conform to any of the existing Refset types/patterns:

We likely therefore need to agree on a new type/format - this will be discussed first in the MAG in the morning and then in the TRAG in the afternoon, with the aim to agree new refset types in these meetings so that they can be used from the December 2023 International Edition Release onwards, and also to create the necessary documentation for the new Refset types in Confluence (as we did for the last new Refset type - the OWL Expression refsets:  

New types / formats agreed?

Do we need to DEPRECATE the earlier versions of the Annotations refset here  5.2.1.6 DEPRECATED: Annotation Reference Set ?

 
Refset Type formats:
Additional fields for the Member Annotation Refset (created to support annotations on members of any refsets):
refsetId                               - Identifies the reference set to which this reference set member belongs. In this case, a subtype descendant of |Member annotation type reference set|.
referencedComponentId - Refers to the referencedComponentId in the referencedMember entry in a refset.
referencedMemberId       - A reference to the UUID of a member in a reference set. The entity to which the annotation is being applied.
Annotation                          - Any descendant of 900000000000459000 |Attribute type (foundation metadata concept)| in the metadata hierarchy.
.
Additional fields for the Component Annotation Refset (created to allow annotations to be assigned to any SNOMED CT component):
refsetId                               - Identifies the reference set to which this reference set member belongs. In this case, a subtype descendant of |Member annotation type reference set|.
referencedComponentId - Refers to the referencedComponentId in the referencedMember entry in a refset.
Annotation                          - Any descendant of 900000000000459000 |Attribute type (foundation metadata concept)| in the metadata hierarchy.
.
Documentation complete?
NO - @Andrew Atkinson to complete proposed Specs and send out internally for review, before sending to TRAG for final review
14

Annotations - refsetDescriptor records

ALL

Once we've agreed the file naming conventions, we also need to quickly confirm that everyone is happy with the Attribute Types + Descriptions that will be applied to them in the refsetDescriptor files. This is all automated now, so I need to verify that it creates the refsetDescriptor records with the desired attribute types, descriptions, etc.

Please let us know if we missed anything or if there are any perceived issues?
NO - ALL GOOD - @Andrew Atkinson to use these in the Specs, and then from the December 2023 International Edition Release onwards
15

Annotations - Additional Relationship file

ALL

Question - do we need to rush the Additional Relationship file into the December/January release urgently (in order to get Language tag in)? 
.
Or can we put this in a future release once we've got the refsets in now?
.
AAT to take this offline and put a proposal together to get the Additional Relationships file (same format as ConcreteValues file) + some technical content (for initial use cases like language tags etc - see Australia) in as soon as possible
16

Annotations - Release Dates

ALL

The release date depends on two factors:

  1. Firstly, the authoring platform being ready for annotations (which will be November 2023)

  2. Secondly, content authored for annotations. Language tags only involve about 8 concepts under 900000000000506000 |Language type reference set (foundation metadata concept)|. The attribution could be done by technical batch changes.

Because of the change to release date (to the 1st of the month), it is therefore unlikely to be possible to prepare everything in time for the December 2023 release. We are therefore currently aiming for the January 2024 releases - as this will also provide the largest audience for these major new changes (as opposed to Feb/March which most users still do not consume).

Is everyone ready for these changes, and comfortable with the proposed release date of January 2024 onwards?
If so, we will publish the EMPTY files in the December 2023 release as a trial run, and then populate them with the first content in the January 2024 International Edition.
This will likely be closely followed by more content being introduced in the various Extensions over the next few months:
Anyone in the room intending to include any data in them anytime soon?
If so, have we covered off all necessary bases?
Any other considerations?
NO - Everyone happy with the proposed timelines
17

Annotations - Validation

All

What validation, if any, do we need?
MUST BE UTF-8 compliant (and we need to lock down what we mean by this, as it means different things to different people - see Peter)
Existing refset validation for COMPONENT IDs (or UUIDs) - (doesn't have to be Active, just an EXISTING component)
Non-empty Annotation validation
@Andrew Atkinson to write up in tickets and request Dev work ASAP
Anyone already have assertions they'd like to donate?
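
As a starting point for the tickets, the three checks listed above could be sketched in Python roughly as follows. This is not an existing assertion: the function name, the column positions and the sample row are all hypothetical, and strict decoding is only one possible reading of "UTF-8 compliant", which still needs to be locked down.

```python
# Sketch only: the three proposed checks applied to one tab-separated refset row.
# Column positions are assumptions, not taken from the agreed specification.


def validate_annotation_row(raw_line: bytes, existing_ids: set[str],
                            ref_col: int = 5, value_col: int = -1) -> list[str]:
    """Return a list of validation errors (empty if the row passes)."""
    # 1. The row must decode as UTF-8 (strict decoding assumed).
    try:
        line = raw_line.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return ["row is not valid UTF-8"]

    fields = line.rstrip("\r\n").split("\t")
    if len(fields) <= ref_col:
        return ["row has too few columns"]

    errors = []
    # 2. The referenced component (or member UUID) must exist; it need not be
    #    active, so existing_ids should include inactive components too.
    if fields[ref_col] not in existing_ids:
        errors.append(f"referenced id {fields[ref_col]} does not exist")
    # 3. The annotation value must not be empty.
    if not fields[value_col].strip():
        errors.append("annotation value is empty")
    return errors


if __name__ == "__main__":
    known = {"138875005"}  # illustrative set of existing component ids
    row = "row-uuid\t20240101\t1\tmodule\trefset\t138875005\tattr\tSome note\n"
    print(validate_annotation_row(row.encode("utf-8"), known))
```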
18

IPS Terminology Product

All

Quick run through of the changes that we're proposing to make in the final Production release in Q4 2022, as compared to the BETA release

(ie) discussion of the feedback that we accepted and have implemented in the Production release:

  1.  INCLUSION OF THE “EML” (new Drugs refset) IN THE FEEDER FOR THIS PRODUCT FROM 2022 ONWARDS

  2. IPS Terminology URI:

  • The terminology code system & code system version used to refer to the IPS Terminology are shown below:
