The purpose of this BEP is to add an explicit equivalence relation (equivalentTo/eq) to the BEL language, both for scenarios where equivalence follows solely from the identifier/function pair (e.g., g(HGNC:AKT1) eq g(ENTREZ:207)) and for more complicated scenarios (e.g., a(CHEBI:"amyloid-beta polypeptide 40") eq p(HGNC:APP, frag(672_711))).
BEP-Id: BEP-0004
Status: Published
Version: 1
BEL-Version: 2.0.0+
Authors: Charlie Hoyt ([email protected])
Created-Date: 2018-07-13
Approved-Date: 2018-11-14
Type: Standards Track
While the equivalencing system of the BEL Framework allowed identifiers to be treated as equivalent, this was not explicitly part of the language. The equivalence relation explicitly represents that two entities are equivalent, whether that equivalence is fully captured by an identifier/function pair or also depends on modifiers.
a(CHEBI:"amyloid-beta polypeptide 40") eq p(HGNC:APP, frag(672_711))
Several modified forms of proteins and other physical entities have more commonly used names, and the two equivalent terms might even use different abundance functions, as in this example.
From dbSNP: g(HGNC:MDGA2, var(c.*1638C>A)) equivalentTo g(dbSNP:rs1235)
Some definitions of named complexes can be included in “backbone” BEL documents using several statements like:
complex(FPLX:"Adaptor_protein_III") hasComponent p(HGNC:AP3B1)
complex(FPLX:"Adaptor_protein_III") hasComponent p(HGNC:AP3D1)
complex(FPLX:"Adaptor_protein_III") hasComponent p(HGNC:AP3S1)
complex(FPLX:"Adaptor_protein_III") hasComponent p(HGNC:AP3S2)
The direct connection between the named complex and a complex defined as the enumeration of its components can now be defined explicitly using the eq relation:
complex(FPLX:"Adaptor_protein_III") eq complex(p(HGNC:AP3B1), p(HGNC:AP3D1), p(HGNC:AP3S1), p(HGNC:AP3S2))
Single nucleotide polymorphism (SNP) databases like dbSNP and SNPedia link SNPs to their reference sequences and their consequences. For example, the MTHFR variant C677T (see: https://www.snpedia.com/index.php/Rs1801133) has been linked to several diseases. Since this variant can be written in BEL either as a gene with a substitution in HGVS nomenclature or directly with the database identifier, the following eq relation can link the statements that use each form:
g(DBSNP:rs1801133) eq g(HGNC:MTHFR, var("g.677C>T"))
g(HGNC:AKT1) eq g(ENTREZ:207)
It is not enough to show that HGNC:AKT1 is equivalent to ENTREZ:207, since both identifiers can be used with geneAbundance(), rnaAbundance(), or proteinAbundance().
NOTE: This is not a recommended approach to handle equivalences between terminologies. This will lead to a large explosion of edges in a graph representing this knowledge. Handling this type of equivalencing is best managed through a terminology service that represents each equivalence set with a preferred (canonical) form or a primary key for each equivalence set.
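The idea of giving each equivalence set a canonical form can be illustrated with a small union-find structure. This is only a sketch under stated assumptions: the class name EquivalenceSets and the rule for choosing the canonical member (the lexicographically smallest term) are hypothetical and are not part of BEL or of any terminology service.

```python
class EquivalenceSets:
    """Union-find over BEL term strings; each set has one canonical member.

    Hypothetical helper for illustration only, not part of any BEL tooling.
    """

    def __init__(self):
        self._parent = {}

    def find(self, term):
        """Return the canonical member of the set containing `term`."""
        self._parent.setdefault(term, term)
        root = term
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited term directly at the root.
        while self._parent[term] != root:
            self._parent[term], term = root, self._parent[term]
        return root

    def union(self, a, b):
        """Record that `a` and `b` are equivalent."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            # Arbitrary assumption: the lexicographically smaller term
            # becomes the canonical form of the merged set.
            keep, drop = sorted((root_a, root_b))
            self._parent[drop] = keep


sets = EquivalenceSets()
sets.union("g(HGNC:AKT1)", "g(ENTREZ:207)")
print(sets.find("g(HGNC:AKT1)"))
```

With this layout, each term stores one pointer to its canonical form rather than one edge per equivalent pair, which is the kind of saving the note above describes.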
The equivalence relation is used like any other predicate: any BEL term can be used as the subject or the object.
<BEL term 1> eq <BEL term 2>
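As a rough illustration of this statement shape, the following sketch splits an equivalence statement into its subject and object terms. It is not a BEL parser: the function name split_equivalence is hypothetical, and the code only assumes that the relation keyword appears outside any parentheses and quotation marks.

```python
RELATIONS = ("equivalentTo", "eq")


def split_equivalence(statement):
    """Split '<term 1> eq <term 2>' into (subject, object).

    Hypothetical helper: it looks for the relation keyword at nesting
    depth zero and outside quoted names, and does not validate the terms.
    """
    depth = 0
    in_quotes = False
    for i, ch in enumerate(statement):
        if ch == '"':
            in_quotes = not in_quotes
        elif in_quotes:
            continue
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == " " and depth == 0:
            rest = statement[i + 1:]
            for relation in RELATIONS:
                if rest.startswith(relation + " "):
                    subject = statement[:i].strip()
                    obj = rest[len(relation) + 1:].strip()
                    return subject, obj
    raise ValueError("no equivalence relation found in: " + statement)


subject, obj = split_equivalence(
    'a(CHEBI:"amyloid-beta polypeptide 40") eq p(HGNC:APP, frag(672_711))'
)
print(subject)
print(obj)
```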
As expected, equivalence is a symmetric relation: A eq B implies B eq A, and vice versa.
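One consequence of symmetry is that a store of equivalences need not keep both directions. A minimal sketch, assuming terms are held as plain strings and using a hypothetical helper named equivalence_key, represents each equivalence as an unordered pair:

```python
def equivalence_key(subject, obj):
    """Return an order-independent key for an equivalence statement.

    Hypothetical helper: a frozenset ignores argument order, so
    'A eq B' and 'B eq A' map to the same key.
    """
    return frozenset((subject, obj))


edges = set()
edges.add(equivalence_key("g(HGNC:AKT1)", "g(ENTREZ:207)"))
edges.add(equivalence_key("g(ENTREZ:207)", "g(HGNC:AKT1)"))
print(len(edges))
```

Both statements collapse to a single stored edge, so a consumer can check either direction with the same lookup.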
This feature applies to BEL 2.0.0+ and is purely an addition to the language, so no effort is required to migrate older BEL knowledge.