Working with Historical Associations
Background
In 2017 Q1, it was clarified in the TIG that the targetComponentId of the Association Refset file is to be considered immutable. Tooling changes were subsequently made to honour this when inactivating concepts. However, these changes assumed that the target component must always be active, which has implications that warrant further consideration.
Proposal
If we accept the choice that's been made to make end users' lives easier by maintaining associations to the most relevant active concept(s) for any given inactive concept, then we can avoid a snowballing snapshot file bloat by allowing the reuse of refset rows. That is, keeping the same UUID and modifying the target.
The targetComponentId field is currently marked as Non-Mutable in the Implementation Guide here: 5.2.1.4 Association Reference Set
Resolution by MAG 15 Oct 2018: TIG is currently ambiguous between text and colour. Confirmed that historical association target field should be mutable. Existing validation can be removed, to allow such re-use.
The text below was written by David Markwell, in response to an email question:
Considerations when inactivating / replacing Historical Associations
My personal view on this is that history should not be rewritten, and in that regard I agree with Michael Lawley's comment. However, I am not sure of the arguments that led to this change. You are correct in the sense that anyone with the full release could trace the changes, but there is a potential anomaly created by shifting to direct links when historically the changes occurred stepwise. This was the argument in the past for not changing the SAME AS associations even when the target became inactive ... instead allowing the indirect transitivity approach to prevail.
The argument is as follows:
Part 1: The SAME_AS --> SAME_AS case
2002
The concepts A, B and C were created and not recognized as duplicates.
2003
The concept A is inactivated as a duplicate of concept B. Historical SAME_AS association concept A to concept B.
2004
The concept B is inactivated as a duplicate of concept C. Historical SAME_AS association concept B to concept C.
Now there are four possibilities:
1.1) Leave historical association of A unchanged
Referential integrity maintained by transitivity as existing association points to B ... which is inactive and SAME_AS points to C.
Results in some added complexity.
1.2) Just add SAME_AS association concept A to concept C
This arguably creates ambiguity as A points to both B and C ... this can be resolved as only one target is an active concept
Results in some added complexity
1.3) Add SAME_AS association concept A to concept C AND inactivate SAME_AS association concept A to concept B
This is a simpler solution for implementers
... But beware, this only works in a specific situation, see below.
1.4) Update SAME_AS association concept A to concept B so it now points to concept C
This is an even simpler solution for implementers and avoids inactive row bloat, as Peter says
... But beware, this only works in this specific situation, see below.
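The four options can be compared by modelling the snapshot rows for concept A after the 2004 change. The following is a minimal sketch only: the `Row` tuple, the row ids `r1` and `r2`, and the `summarize` helper are illustrative assumptions rather than any real tooling, and only the column names `id`, `active`, `referencedComponentId` and `targetComponentId` follow the Association Refset layout.

```python
from typing import NamedTuple


class Row(NamedTuple):
    id: str                      # stands in for the row UUID (hypothetical values)
    active: bool
    referencedComponentId: str   # the inactive source concept
    targetComponentId: str       # the association target


# Snapshot rows for concept A's SAME_AS associations after the 2004 release.
# The B SAME_AS C row is present in every option and is omitted here.
OPTIONS = {
    "1.1": [Row("r1", True, "A", "B")],
    "1.2": [Row("r1", True, "A", "B"), Row("r2", True, "A", "C")],
    "1.3": [Row("r1", False, "A", "B"), Row("r2", True, "A", "C")],
    "1.4": [Row("r1", True, "A", "C")],  # same row id, target changed
}

ACTIVE_CONCEPTS = {"C"}  # A and B are both inactive by 2004


def summarize(rows):
    """Count snapshot rows and list which active rows point at an active concept."""
    active_targets = sorted(r.targetComponentId for r in rows if r.active)
    direct_active = [t for t in active_targets if t in ACTIVE_CONCEPTS]
    return {
        "rows": len(rows),
        "active_targets": active_targets,
        "direct_active": direct_active,
    }


SUMMARY = {name: summarize(rows) for name, rows in OPTIONS.items()}

for name, summary in SUMMARY.items():
    print(name, summary)
```

In this toy model, only option 1.1 leaves A without a direct link to an active concept, and only option 1.4 achieves the direct link without adding a second row to the snapshot.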
Part 2: Histories including POSSIBLY_EQUIVALENT
2010
Concept C is recognized as ambiguous.
Concepts D and E are created to represent the two possible meanings.
Two POSSIBLY_EQUIVALENT associations are created, one from C to D and one from C to E.
Now there are many possibilities. First we consider the simple case, following through the same options as used in the first part of the scenario.
2.1) Nothing needs to be done.
Leave historical associations of A unchanged
Leave historical associations of B unchanged
In both cases rely on transitivity through the POSSIBLY_EQUIVALENT associations of C to D and E.
2.2) Just add the following 4 inferred POSSIBLY_EQUIVALENT associations
A to D, A to E, B to D and B to E.
2.3) Add the 4 inferred POSSIBLY_EQUIVALENT associations as in 2.2
Inactivate the associations B SAME_AS C and A SAME_AS C (note that A SAME_AS B was inactivated in 1.3)
2.4) It is impossible to fully replicate the 1.4 approach here. Therefore, the only logical approach for the new change would be the same step as in 2.3
Add the 4 inferred POSSIBLY_EQUIVALENT associations as in 2.2
Inactivate the active B SAME_AS C association
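Option 2.1 relies on consumers following the chain of associations at read time. A minimal sketch of such a traversal is shown here. The dictionary layout, the `resolve` function and the "weakest link" rule (a chain containing any POSSIBLY_EQUIVALENT step is reported as POSSIBLY_EQUIVALENT) are assumptions made for illustration, not a documented algorithm.

```python
# Assumed ordering: a chain is only as strong as its weakest association.
STRENGTH = {"SAME_AS": 2, "POSSIBLY_EQUIVALENT": 1}

ACTIVE = {"D", "E"}  # state after the 2010 change

# Active historical associations, left unchanged as in option 2.1.
ASSOCIATIONS = {
    "A": [("SAME_AS", "B")],
    "B": [("SAME_AS", "C")],
    "C": [("POSSIBLY_EQUIVALENT", "D"), ("POSSIBLY_EQUIVALENT", "E")],
}


def resolve(concept, via="SAME_AS", path=frozenset()):
    """Return {active target: weakest association type on the path to it}."""
    if concept in ACTIVE:
        return {concept: via}
    if concept in path:  # guard against cycles in the association data
        return {}
    result = {}
    for assoc_type, target in ASSOCIATIONS.get(concept, []):
        weakest = min(via, assoc_type, key=STRENGTH.__getitem__)
        found = resolve(target, weakest, path | {concept})
        for active_target, found_type in found.items():
            previous = result.get(active_target)
            # If a target is reachable by several paths, keep the strongest.
            if previous is None or STRENGTH[found_type] > STRENGTH[previous]:
                result[active_target] = found_type
    return result


print(resolve("A"))
print(resolve("B"))
```

Under these assumptions both A and B resolve to D and E as POSSIBLY_EQUIVALENT targets, without any new rows being added for A or B.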
Opinion/Conclusion 1
In my view, the fact that 2.4 is "impossible" suggests that approach 1.4 should be avoided, as it leads to inconsistent handling in cases where the chain involves a different type of historical association.
Part 3: Histories including POSSIBLY_EQUIVALENT - Additional Complexity
There are of course further issues of complexity that arise when considering mixtures of historical associations.
Options 2.2 and 2.3 both assume complete transitivity of the ambiguity. However, it is possible that this is not the case. For example, suppose the ambiguity of concept B or C resulted from something specific to the fully specified name or synonyms of those concepts. It is entirely possible that concept A did not share the ambiguity of concept C and unambiguously had the same meaning as D. In that case, the appropriate resolution might be:
3.1) Resolve A and B separately as follows:
Add A SAME_AS D
Inactivate A SAME_AS B
Note: This is necessary as it is now recognized that A is NOT the same as B (whereas in 1.1 and 2.1 this was not done, as A SAME_AS B was still deemed to be true).
Leave associations of B unchanged
Transitivity resolves the history of B, which is still regarded as SAME_AS C and consequently transitively POSSIBLY_EQUIVALENT to D and E.
3.2) Resolve A and B separately as follows:
Add A SAME_AS D
Inactivate A SAME_AS B
Inactivate A SAME_AS C
Note: Because of the 1.2 approach there are two A SAME_AS ... associations that are incorrect and still active ... so both must be inactivated.
Add the following 2 inferred POSSIBLY_EQUIVALENT associations
B to D and B to E.
3.3) Resolve A and B separately as follows:
Add A SAME_AS D
Inactivate A SAME_AS C
Note: A SAME_AS B was inactivated in a previous release as per 1.3.
Add the following 2 inferred POSSIBLY_EQUIVALENT associations
B to D and B to E.
Inactivate the active B SAME_AS C association (created in 1.3).
3.4) It is impossible to replicate the 1.4 approach for B (see note on 2.4) but it is possible to apply this variant to A.
Change target of the A SAME_AS C association to D
For this reason I prefer approaches x.1 or x.2, as in these two cases historical associations that remain true are not inactivated. This may be significant for retrospective analysis of changes made, or of anomalies in reports (i.e. between the releases of 2004 and 2010). It is similarly significant in cases where data is communicated between systems that have not upgraded to 2010 and those that have.
Of course, this presumes that when a concept is inactivated, consideration should be given to any historical associations that point to that concept from previously inactivated concepts. I think that is a reasonable requirement.
I think option x.4 creates problems that are best avoided. Not only does it not work with combinations of different associations but also it obscures the original and still potentially correct association which may have been used in previous analyses. Hence the reason why I think the target of the association should be immutable.
Discussion
It is true that those wanting the in-between steps can refer to the full release. However, the downside of this is that those not using the full release will miss the validity of this change.
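The difference between what Full and Snapshot users can see under option 1.4 can be illustrated as follows. This is a sketch under stated assumptions: the row id `r1` and the effectiveTime values are made up for illustration, and the `snapshot` helper simply keeps the latest version of each row id, which is how a snapshot view is derived from a full history.

```python
from typing import NamedTuple


class Row(NamedTuple):
    id: str                      # hypothetical row UUID
    effectiveTime: str           # illustrative YYYYMMDD values
    active: bool
    referencedComponentId: str
    targetComponentId: str


# Full release under option 1.4: one row id with two versions.
FULL = [
    Row("r1", "20030131", True, "A", "B"),
    Row("r1", "20040131", True, "A", "C"),  # same id, target changed
]


def snapshot(full, as_of="99999999"):
    """Keep only the most recent version of each row id up to `as_of`."""
    latest = {}
    for row in sorted(full, key=lambda r: r.effectiveTime):
        if row.effectiveTime <= as_of:
            latest[row.id] = row
    return list(latest.values())


current = snapshot(FULL)
earlier = snapshot(FULL, as_of="20031231")

print(current)  # only the A -> C version survives
print(earlier)  # the A -> B step is visible only by replaying the full history
```

A consumer holding only the current snapshot sees a single row from A to C, with no indication that A was ever associated with B.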
There are also implications for extensions if transitivity is no longer assumed (and/or if the previous historical associations are no longer active). Again, good practice with dependencies should avoid this issue. However, there are a range of impacts to be considered.
Finally, using the relationship analogy, it can be argued that the historical associations could be considered as a mix of 'stated' and 'inferred' associations (or even a kind of transitive closure). Making this explicit might assist people to understand what is going on.
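One way to make the stated/inferred distinction explicit is to compute a transitive closure over the stated associations and label each pair. The sketch below uses the A to E scenario from the email above and deliberately ignores association types. The `transitive_closure` and `label` helpers are hypothetical names introduced for illustration.

```python
# Stated associations from the scenario: (source, target) pairs.
STATED = {("A", "B"), ("B", "C"), ("C", "D"), ("C", "E")}


def transitive_closure(pairs):
    """Repeatedly compose pairs until no new (source, target) pair appears."""
    closure = set(pairs)
    while True:
        new = {
            (a, d)
            for a, b in closure
            for c, d in closure
            if b == c
        } - closure
        if not new:
            return closure
        closure |= new


def label(stated):
    """Mark each pair in the closure as 'stated' or 'inferred'."""
    return {
        pair: ("stated" if pair in stated else "inferred")
        for pair in sorted(transitive_closure(stated))
    }


LABELLED = label(STATED)

for pair, kind in LABELLED.items():
    print(pair, kind)
```

In this model the pairs from A to C, D and E and from B to D and E come out as inferred, while the four original associations remain stated.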
My Conclusion
On balance, if we were starting from scratch at this point I would probably favour the x.2 approaches on the grounds that
a) this adds new stated associations
b) includes the transitive associations that are considered true
c) explicitly inactivates associations that are no longer considered true
Status: Initial Discussion
Copyright © 2025, SNOMED International