Latin Terms - Discussion

Description

Coming from the Translation User Group, various countries would like to collaborate on common terms for Body Structures and Organisms where they are Latin. See https://projects.jira.snomed.org/browse/SCTF-88.

A more general requirement may exist of allowing country extensions to re-use "en" terms, rather than duplicating those homographs in other languages.

Objectives

Enumerate and discuss the various options for working with Latin terms.

Status: Discussion

Discussion

SI Internal Group Discussion 17 January 2024

The SI group felt that leaving the current terms as "en" would avoid confusion, while introducing a Latin language refset would allow these terms to be identified (with a view to reuse) in a way that is compatible with existing implementations. The phrase "used in language X" moves us towards a pragmatic approach that is useful, e.g. in text searching, and away from being stuck on "this text is language X".

The suggestion is that a country would take the Latin language reference set and copy those values into its own language reference set. This avoids adding to the complexity of implementations, which would otherwise need to search across a set of language reference sets with graceful fallback. At this point the set of Latin terms doesn't actually need to be a language reference set; it could just be a simple set of descriptions.
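The copy step described above can be sketched roughly as follows. This is a minimal illustration, not a description of any existing SI tooling: it assumes language refset members are held as dictionaries keyed by the RF2 language refset column names, and the function name `copy_latin_members` and its parameters are hypothetical.

```python
import uuid


def copy_latin_members(latin_rows, target_refset_id, target_module_id,
                       already_present=frozenset()):
    """Build new member rows for a country language refset.

    One row is produced per active member of the Latin set whose description
    is not already referenced by the target refset (already_present).
    """
    copied = []
    for row in latin_rows:
        if row["active"] != "1":
            continue  # skip inactive members of the Latin set
        description_id = row["referencedComponentId"]
        if description_id in already_present:
            continue  # the country refset already references this description
        copied.append({
            "id": str(uuid.uuid4()),    # each new member gets its own UUID
            "effectiveTime": "",        # left blank: not yet published
            "active": "1",
            "moduleId": target_module_id,
            "refsetId": target_refset_id,
            "referencedComponentId": description_id,
            # Assumption: acceptability is carried over unchanged; a country
            # may well want to review this per term.
            "acceptabilityId": row["acceptabilityId"],
        })
    return copied
```

Carrying the acceptability over unchanged is only one possible choice; the sketch does not address how later changes to the Latin set would be kept in sync with the copies.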

Latin terms submitted for addition to the International Edition are expected to be "en" (we can't change a Description language code, and we're looking to avoid creating descriptions with a language code of "la").

Solution Options:

A language reference set maintained as a community content module. ← preferred solution
Enhance the Refset Tooling to support simple sets of descriptions, and maintain the set there.
Enhance annotations to allow annotations on descriptions.
Use a simple refset of descriptions, with API-based tooling and no UI support.

More generally (relating to SCTF-88), SI tooling should move towards allowing descriptions in some arbitrary language to be referenced in any available language reference set. We'll allow this to go back for discussion with the Translation Group before any tooling changes are made, and also before any community content area might be set up.

Note that work is already planned to add language and dialect annotations (or additional relationships) to language refset concepts.

Note that the additional prefix/suffix words in the Body Structure hierarchy (e.g. Structure of X, X structure) move terms away from being purely Latin. Compare 181700004 |Entire gastrocnemius (body structure)|, which contains Latin + English, with 7173007 |Cauda equina structure (body structure)|, which has a Latin term without "structure". Yong thought that having the synonym on the "Structure" concept in the SEP pattern would be more useful.

Option: Move existing Latin "en" terms to "la"

This would be illegal as the langCode field has been declared as immutable. The sub-options are therefore either to inactivate the "en" descriptions and re-create them as "la", or to leave the "en" terms as they are and create duplicate "la" terms.

Pro: This would be an alignment to our own specification and the original intention for Language Reference Sets that they would indicate acceptability regardless of language of origin.

Con: Effort required and potential disruption to existing implementations.

Question: Could we consider allowing modification of the language code, either as a one-off, or by modifying the RF2 spec more completely to allow such changes going forward?

Option: Put Latin terms into a new module

Pro: This would be a good option if we were to have these terms maintained as community content.

Con: If the "la" terms would be a part of the International Edition, then country extensions would already inherit these terms in any event, and there is no need for the added complexity of a separately versioned module (with associated problems of dependency version compatibility).

Con: Leaving the "la" terms in the International Module would leave the maintenance burden with the International team.

Tooling Discussion

Authoring Platform Validation is currently set up to enforce alignment between the description language code and a particular set of language reference sets. This effectively forces these homograph terms to be recreated as sort-of-duplicates in other languages. This situation may also relate to ambiguity in the original RF2 specification.
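The kind of alignment check described above, and a relaxed alternative, might look roughly like the sketch below. This is purely illustrative and is not the Authoring Platform's actual validation code: the mapping `ALLOWED_REFSETS`, the refset identifiers in it, and both function names are hypothetical placeholders.

```python
# Hypothetical mapping from a description's language code to the language
# refsets that may reference it. The refset ids are placeholders, not real
# SNOMED CT identifiers.
ALLOWED_REFSETS = {
    "en": {"EN_US_REFSET", "EN_GB_REFSET"},
    "da": {"DA_REFSET"},
}


def validate_strict(language_code, refset_id, allowed=ALLOWED_REFSETS):
    """Strict rule: the refset must be one of those tied to the language code."""
    return refset_id in allowed.get(language_code, set())


def validate_relaxed(language_code, refset_id, allowed=ALLOWED_REFSETS):
    """Relaxed rule: any known language refset may reference the description,
    whatever its language code."""
    known_refsets = set().union(*allowed.values())
    return refset_id in known_refsets
```

Under the strict rule an "en" description cannot be referenced from the placeholder Danish refset, which is the behaviour that forces homographs to be recreated; under the relaxed rule the same reference would pass.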

This was partly done as a simplification for the user interface by reducing the number of language reference set drop downs that are displayed for each term. An option here is that an "la"

Issues

Where a language accept header of "en" is specified, we would fail to match an "la" term by default.
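The matching problem can be illustrated with a small sketch. It assumes a simplified search that filters descriptions purely on their language code; the function name `filter_by_language`, the description ids, and the "la" example term are hypothetical and are not taken from any release.

```python
def filter_by_language(descriptions, accepted_codes):
    """Keep only descriptions whose language code is in the accepted list."""
    accepted = set(accepted_codes)
    return [d for d in descriptions if d["languageCode"] in accepted]


# Hypothetical data: the ids and the "la" description are illustrative only.
descriptions = [
    {"id": "D1", "term": "Cauda equina structure", "languageCode": "en"},
    {"id": "D2", "term": "Cauda equina", "languageCode": "la"},
]

# A client asking only for "en" does not see the "la" description.
english_only = filter_by_language(descriptions, ["en"])

# The client would have to ask for "la" explicitly to get both.
english_and_latin = filter_by_language(descriptions, ["en", "la"])
```

Here `english_only` contains just D1, so any client that does not know to request "la" silently loses the Latin term.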

If we allow the flexibility of putting, say, English language terms into non-English Language Reference Sets, is the need for identifying Latin terms reduced to the point where we could avoid adding complexity?

Use Cases

References

Project: Medical Latin descriptions

https://projects.jira.snomed.org/browse/SCTF-88

https://iate.europa.eu/home