Identifier Translation
One of waldo’s features is the ability to translate between different gene identifiers.
It knows about the following identifier types:
embl:cds
    EMBL CDS
ensembl:peptide_id
    ENSEMBL Peptide ID
ensembl:gene_id
    ENSEMBL Gene ID
ensembl:transcript_id
    ENSEMBL Transcript ID
mgi:id
    MGI ID
mgi:symbol
    MGI Symbol
mgi:name
    MGI Name
refseq:accession
    RefSeq Accession
uniprot:name
    Uniprot Name
uniprot:accession
    Uniprot Accession
locate:id
    Locate ID
hpa:id
    Human Protein Atlas ID
Strings such as ensembl:gene_id are the ones used in the code.
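Because the identifier types are plain strings, a misspelled one is easy to miss. The following is a minimal sketch of a check that could be run before translating, assuming the list above is complete. KNOWN_TYPES and check_type are hypothetical names introduced here for illustration; they are not part of waldo.

```python
# Hypothetical helper, not part of waldo. The set is copied from the
# identifier types listed in this document.
KNOWN_TYPES = frozenset([
    'embl:cds',
    'ensembl:peptide_id',
    'ensembl:gene_id',
    'ensembl:transcript_id',
    'mgi:id',
    'mgi:symbol',
    'mgi:name',
    'refseq:accession',
    'uniprot:name',
    'uniprot:accession',
    'locate:id',
    'hpa:id',
])


def check_type(id_type):
    """Return (database, field) for a known identifier type, else raise ValueError."""
    if id_type not in KNOWN_TYPES:
        raise ValueError('unknown identifier type: {!r}'.format(id_type))
    # Every listed type has the form 'database:field'.
    database, field = id_type.split(':', 1)
    return database, field


print(check_type('ensembl:gene_id'))
```

A string such as 'ensembl:gene' would raise ValueError here instead of being passed on unnoticed.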
Here is a simple example of how to translate Uniprot accessions to Uniprot names:
from waldo import translate

accessions = [
    'P60709',
    'P07437',
    'Q9BQE3',
    'Q9NY65',
]

for a in accessions:
    n = translate(a, 'uniprot:accession', 'uniprot:name')
    print('{} -> {}'.format(a, n))
This prints:
P60709 -> ACTB_HUMAN
P07437 -> TBB5_HUMAN
Q9BQE3 -> TBA1C_HUMAN
Q9NY65 -> TBA8_HUMAN
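
When many identifiers need translating, it can be convenient to collect the results in a dictionary. The sketch below shows one way to do that. It takes the translation function as a parameter so that it runs without a waldo database: translate_all, fake_translate and _TABLE are hypothetical names, and the stub returning None for an unknown identifier is an assumption of this example, not documented waldo behaviour. In real use, waldo's translate would be passed in place of the stub.

```python
def translate_all(identifiers, source, target, translate_fn):
    """Map each identifier to translate_fn(identifier, source, target)."""
    return {i: translate_fn(i, source, target) for i in identifiers}


# Stand-in for waldo's translate, holding only the four results shown above.
_TABLE = {
    'P60709': 'ACTB_HUMAN',
    'P07437': 'TBB5_HUMAN',
    'Q9BQE3': 'TBA1C_HUMAN',
    'Q9NY65': 'TBA8_HUMAN',
}


def fake_translate(identifier, source, target):
    # Assumption of this stub: unknown identifiers yield None.
    return _TABLE.get(identifier)


names = translate_all(
    ['P60709', 'P07437'],
    'uniprot:accession',
    'uniprot:name',
    fake_translate,
)
for accession, name in names.items():
    print('{} -> {}'.format(accession, name))
```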