This BEP proposes additional syntax (e.g., DEFINE NAMESPACE HGNC AS MIRIAM
) for defining a namespace or annotation by regular expression via a lookup to its entry in the MIRIAM database.
BEP-Id: BEP-0009
Status: Published
Version: 1
BEL-Version: 2.0.0+
Authors: Charles Tapley Hoyt ([email protected])
Created-Date: 2019-08-01
Type: Standards Track
The BEL 2.0 standard already allowed annotations to be defined as a PATTERN
(i.e., regular expression).
Following the acceptance of BEP-0005, BEL 2.1 allowed namespaces to be defined the same way.
The MIRIAM database compiles the common databases and namespaces as well as the regular expression corresponding to valid identifiers in each. The goal of this BEP is to make use of the MIRIAM database to make definition of namespaces by regular expression more standard and more easy.
BEL frameworks can access the MIRIAM database through its API (e.g., https://registry.api.identifiers.org/restApi/namespaces/search/findByPrefix?prefix=hgnc) to look up the appropriate regular expression which will then be handled the same as any namespace defined by a regular expression.
Instead of defining the HGNC namespace using a regular expression like the following
DEFINE NAMESPACE HGNC AS PATTERN "^((HGNC|hgnc):)?\\d{1,5}$"
the proposed syntax would allow users to defer the lookup of the pattern to MIRIAM with
DEFINE NAMESPACE HGNC AS MIRIAM
.
ec-code
or the dot character that appears in pubchem.compound
? I would propose we check the MIRIAM registry and relax the allowed characters in keywords to at least allow dashes and dots. This will be taken care of with BEP-0008 in which the way namespaces and identifiers are handled.The syntax is similar to other namespace and annotation definitions. For namespaces, it looks like:
DEFINE NAMESPACE <keyword> AS MIRIAM
For annotations, it would look like:
DEFINE ANNOTATION <keyword> AS MIRIAM
No issues with backwards compatibility.
TBD