
Add Equivalence Relation

Abstract

This BEP adds an explicit equivalence relation (equivalentTo/eq) to the BEL language. It covers scenarios in which equivalence follows solely from the identifier/function pair (e.g., g(HGNC:AKT1) eq g(ENTREZ:207)) as well as more complicated scenarios (e.g., a(CHEBI:"amyloid-beta polypeptide 40") eq p(HGNC:APP, frag(672_711))).

Preamble

BEP-Id: BEP-0004

Status: Published

Version: 1

BEL-Version: 2.0.0+

Authors: Charlie Hoyt ([email protected])

Created-Date: 2018-07-13

Approved-Date: 2018-11-14

Type: Standards Track

Rationale and Goals

While the equivalencing system from the BEL Framework allowed identifiers to be treated as equivalent, it was not explicitly part of the language. The goal of adding an equivalence relation is to state explicitly that two entities are equivalent, whether that equivalence is fully captured by an identifier/function pair or also requires modifiers.

Use Cases

Case 1: Complex definitions of equivalence

a(CHEBI:"amyloid-beta polypeptide 40") eq p(HGNC:APP, frag(672_711))

Several modified forms of proteins and other physical entities have more commonly used names, and the two representations may even use different abundance functions, as in this example, where a() is equated with p().

A similar example using a dbSNP identifier:

g(HGNC:MDGA2, var("c.*1638C>A")) equivalentTo g(dbSNP:rs1235)

Case 2: Equivalence between named complexes and enumerated complexes

Some definitions of named complexes can already be included in “backbone” BEL documents as several statements like:

complex(FPLX:"Adaptor_protein_III") hasComponent p(HGNC:AP3B1)
complex(FPLX:"Adaptor_protein_III") hasComponent p(HGNC:AP3D1)
complex(FPLX:"Adaptor_protein_III") hasComponent p(HGNC:AP3S1)
complex(FPLX:"Adaptor_protein_III") hasComponent p(HGNC:AP3S2)

The direct connection between the named complex and a complex defined as the enumeration of its components can now be stated explicitly using the eq relation:

complex(FPLX:"Adaptor_protein_III") eq complex(p(HGNC:AP3B1), p(HGNC:AP3D1), p(HGNC:AP3S1), p(HGNC:AP3S2))
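As a rough illustration of how the two forms relate, the following Python sketch collapses a list of hasComponent statements into a single eq statement. The function name `enumerated_equivalence` is hypothetical and not part of any BEL tooling. The sketch assumes that the statements list every member of the complex, that they all share one subject, and that simple string splitting is good enough for these inputs (a real implementation would use a BEL parser).

```python
def enumerated_equivalence(statements):
    """Collapse `X hasComponent Y` lines that share one subject into one `eq` line.

    Hypothetical helper: assumes the list is the complete membership of the complex.
    """
    subjects = []
    components = []
    for line in statements:
        # Naive split on the relation keyword; not a full BEL parser.
        subject, sep, component = line.partition(" hasComponent ")
        if not sep:
            raise ValueError(f"not a hasComponent statement: {line!r}")
        subjects.append(subject.strip())
        components.append(component.strip())
    if len(set(subjects)) != 1:
        raise ValueError("all statements must share exactly one subject")
    return f"{subjects[0]} eq complex({', '.join(components)})"


backbone = [
    'complex(FPLX:"Adaptor_protein_III") hasComponent p(HGNC:AP3B1)',
    'complex(FPLX:"Adaptor_protein_III") hasComponent p(HGNC:AP3D1)',
    'complex(FPLX:"Adaptor_protein_III") hasComponent p(HGNC:AP3S1)',
    'complex(FPLX:"Adaptor_protein_III") hasComponent p(HGNC:AP3S2)',
]
eq_statement = enumerated_equivalence(backbone)
print(eq_statement)
```

The component order in the output follows the order of the input statements.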

Case 3: Genetic modification equivalences

Single nucleotide polymorphism (SNP) databases like dbSNP and SNPedia link given SNPs to their reference sequences and their consequences. For example, the MTHFR variant C677T (see: https://www.snpedia.com/index.php/Rs1801133) has been linked to several diseases. In BEL, this variant can be written either as a gene with a substitution in HGVS nomenclature or directly with the database identifier, so the following eq relation links statements written in either form:

g(DBSNP:rs1801133) eq g(HGNC:MTHFR, var("g.677C>T"))

Case 4: Simple correspondence between two identifiers for the same gene

g(HGNC:AKT1) eq g(ENTREZ:207)

It is not enough to state that the identifier HGNC:AKT1 is equivalent to ENTREZ:207 on its own, since both identifiers can be used with geneAbundance(), rnaAbundance(), or proteinAbundance(). The equivalence is therefore stated between complete BEL terms.

NOTE: This is not a recommended approach for handling equivalences between terminologies, because it leads to an explosion of edges in a graph representing this knowledge. This type of equivalencing is best managed through a terminology service that represents each equivalence set with a preferred (canonical) form or a primary key.
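To make the idea of a canonical form per equivalence set concrete, here is a minimal Python sketch based on a union-find structure. It is not the design of any existing terminology service. The class name `EquivalenceSets` and the rule of choosing the lexicographically smallest term as canonical are assumptions made purely for illustration.

```python
class EquivalenceSets:
    """Hypothetical in-memory store mapping each term to one canonical term."""

    def __init__(self):
        self._parent = {}

    def find(self, term):
        """Return the canonical term for `term` (a term never seen is its own canonical form)."""
        self._parent.setdefault(term, term)
        root = term
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited term directly at the root.
        while self._parent[term] != root:
            self._parent[term], term = root, self._parent[term]
        return root

    def add_equivalence(self, a, b):
        """Merge the equivalence sets containing `a` and `b`."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            # Assumption: the lexicographically smaller term becomes canonical.
            keep, drop = sorted((root_a, root_b))
            self._parent[drop] = keep


sets = EquivalenceSets()
sets.add_equivalence("g(HGNC:AKT1)", "g(ENTREZ:207)")
canonical = sets.find("g(HGNC:AKT1)")
print(canonical == sets.find("g(ENTREZ:207)"))
```

With such a lookup, statements can be rewritten to their canonical terms before being added to a graph, so no eq edges between identifiers need to be stored.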

Discussion

Specification

The equivalence relation is used like any other predicate, and any BEL term can be used as the subject or the object.

<BEL term 1> eq <BEL term 2>

As expected, equivalence is a symmetric relation: A eq B implies B eq A.
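One way a tool could respect this symmetry is to store each equivalence as an unordered pair. The Python sketch below is illustrative only: `parse_equivalence` is a hypothetical helper, and it assumes that neither term contains the text " eq " or " equivalentTo " inside a quoted name, which a real BEL parser would have to handle properly.

```python
def parse_equivalence(statement):
    """Split `<BEL term 1> eq <BEL term 2>` into an unordered pair of terms.

    Hypothetical helper using naive string splitting, not a BEL parser.
    """
    for keyword in (" equivalentTo ", " eq "):
        subject, sep, obj = statement.partition(keyword)
        if sep:
            # A frozenset ignores order, so both directions compare equal.
            return frozenset((subject.strip(), obj.strip()))
    raise ValueError(f"no equivalence relation found in: {statement!r}")


forward = parse_equivalence("g(HGNC:AKT1) eq g(ENTREZ:207)")
backward = parse_equivalence("g(ENTREZ:207) equivalentTo g(HGNC:AKT1)")
print(forward == backward)
```

Because both statements reduce to the same pair, a collection of such pairs holds each equivalence once regardless of the direction in which it was written.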

Backwards Compatibility

This proposed feature targets BEL 2.0.0+ and is purely an addition to the language, so no effort is required to migrate older BEL knowledge.