
Allow Namespace Definition via Regular Expressions

Abstract

The purpose of this BEP is to introduce syntax for defining namespaces with regular expressions, mirroring the existing syntax for defining annotations with regular expressions. This will allow definitions from https://identifiers.org to be incorporated into BEL more quickly, without the overhead of defining a BELNS file, but with the caveat of a weaker semantic check: a name is validated only against the pattern, not against an enumerated list of entries.

Preamble

BEP-Id: BEP-0005

Status: Published

Version: 1

BEL-Version: 2.0.0+

Authors: Charles Tapley Hoyt ([email protected])

Created-Date: 2018-07-17

Type: Standards Track

Rationale and Goals

Most biological resources have both standardized identifiers and human-readable labels. While BEL curators and users will likely continue to use these labels, automatically generated resources would benefit greatly from the flexibility provided by defining entities with their standardized identifiers.

The formats of these standardized identifiers are listed on https://identifiers.org. For example, the PFAM page contains the regular expression for its standardized identifiers (^PF\d{5}$).

Use Cases

A BEL document can be written using a mix of namespace definitions from several sources:

DEFINE NAMESPACE PFAM AS PATTERN "^PF\d{5}$"
DEFINE NAMESPACE HGNC AS URL "http://bel.scai.fraunhofer.de/namespace/hgnc/hgnc-20180215.belns"

...

# Disambiguation: PF02171 is the "Piwi domain"
p(HGNC:PIWIL1) isA p(PFAM:PF02171)

...

Additionally, namespaces that are prohibitively large to enumerate as BEL namespaces (such as dbSNP) can be incorporated into BEL documents much more easily.
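The trade-off between the two kinds of namespace definition can be illustrated with a minimal Python sketch. It is not taken from any BEL tooling: the `NAMESPACES` mapping and the `is_valid` helper are hypothetical, and the HGNC entry is a tiny stand-in for the names that would be loaded from a BELNS file. It shows that an enumerated namespace checks membership, while a pattern namespace only checks that a name is well formed.

```python
import re

# Hypothetical in-memory model of the two definitions from the use case above.
NAMESPACES = {
    # Stand-in for the names enumerated in a BELNS file (assumption).
    "HGNC": {"PIWIL1"},
    # Pattern taken from the DEFINE NAMESPACE PFAM AS PATTERN example.
    "PFAM": re.compile(r"^PF\d{5}$"),
}


def is_valid(namespace, name):
    """Check a name against an enumerated or a pattern-based namespace."""
    definition = NAMESPACES[namespace]
    if isinstance(definition, re.Pattern):
        # Pattern namespace: only the shape of the identifier is checked.
        return definition.match(name) is not None
    # Enumerated namespace: the name must be listed explicitly.
    return name in definition


print(is_valid("HGNC", "PIWIL1"))   # listed name
print(is_valid("PFAM", "PF02171"))  # identifier from the use case
print(is_valid("PFAM", "PF99999"))  # well formed, accepted whether or not such an entry exists
print(is_valid("PFAM", "PF123"))    # too few digits, rejected
```

The third call is the weaker semantic check in practice: any identifier with the right shape passes, so typos that still match the pattern go undetected.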

Discussion

Specification

Namespaces will be defined with the following syntax:

DEFINE NAMESPACE X AS PATTERN "Y"

where X is the keyword used to refer to the namespace within the document and Y is a valid regular expression string representing the pattern of the identifiers.

It is recommended to use a capitalized version of the namespace name from https://identifiers.org and to take the regular expression from https://identifiers.org as well.

Example:

DEFINE NAMESPACE PFAM AS PATTERN "^PF\d{5}$"
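As a rough illustration of how a parser might handle this statement, the following Python sketch extracts the keyword and compiles the pattern. The names `DEFINITION_RE` and `parse_pattern_namespace` are hypothetical, and the whitespace and quoting rules are simplifying assumptions rather than part of this specification.

```python
import re

# Hypothetical matcher for: DEFINE NAMESPACE X AS PATTERN "Y"
# Assumes a keyword made of word characters and a double-quoted pattern.
DEFINITION_RE = re.compile(
    r'^DEFINE\s+NAMESPACE\s+(?P<keyword>\w+)\s+AS\s+PATTERN\s+"(?P<pattern>.*)"\s*$'
)


def parse_pattern_namespace(line):
    """Return (keyword, compiled pattern) or raise ValueError."""
    match = DEFINITION_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a pattern namespace definition: {line!r}")
    try:
        # Y must be a valid regular expression.
        compiled = re.compile(match.group("pattern"))
    except re.error as exc:
        raise ValueError(f"invalid regular expression: {exc}") from exc
    return match.group("keyword"), compiled


keyword, pattern = parse_pattern_namespace(
    r'DEFINE NAMESPACE PFAM AS PATTERN "^PF\d{5}$"'
)
print(keyword, pattern.match("PF02171") is not None)
```

Rejecting an invalid regular expression at definition time, rather than when the first name is checked, keeps the error close to the line that caused it.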

Backwards Compatibility

This proposed feature applies to BEL 2.0.0+ and is an addition to the language, so it does not require any effort to migrate older BEL knowledge.