Tissue¶
lamindb provides access to the following public Tissue ontology through bionty: Uberon.
Here we show how to access and search Tissue ontologies to standardize new data.
import bionty as bt
import pandas as pd
PublicOntology objects¶
Let us create a public ontology accessor with the .public method, which chooses a default public ontology source from Source.
It’s a PublicOntology object, which you can think of as a public registry:
tissues = bt.Tissue.public(organism="all")
tissues
💡 connected lamindb: testuser1/test-public-ontologies
PublicOntology
Entity: Tissue
Organism: all
Source: uberon, 2024-02-20
#terms: 15567
As for registries, you can export the ontology as a DataFrame:
df = tissues.df()
df.head()
ontology_id | name | definition | synonyms | parents |
---|---|---|---|---|
UBERON:0000000 | processual entity | An Occurrent [Span:Occurrent] That Exists In T... | None | [] |
UBERON:0000002 | uterine cervix | Lower, Narrow Portion Of The Uterus Where It J... | uterine cervix|neck of uterus|cervical canal o... | [UBERON:0001560, UBERON:0005156] |
UBERON:0000003 | naris | Orifice Of The Olfactory System. The Naris Is ... | None | [UBERON:0000161] |
UBERON:0000004 | nose | The Olfactory Organ Of Vertebrates, Consisting... | nasal sac|peripheral olfactory organ|nose | [UBERON:0002268, UBERON:0004121, UBERON:0000475] |
UBERON:0000005 | chemosensory organ | None | chemosensory sensory organ | [UBERON:0000020] |
Unlike registries, you can also export it as a Pronto object via public.ontology.
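The parents column of the exported table lists each term's direct parent IDs. As a minimal sketch of how that column can be used, the snippet below inverts a parent mapping into a children index. It uses plain Python rather than bionty, the rows are copied from tables on this page, and `invert_parents` is a hypothetical helper, not part of the library.

```python
from collections import defaultdict

# ontology_id -> direct parent IDs, copied from tables on this page.
parents = {
    "UBERON:0002299": ["UBERON:0004119", "UBERON:0003215"],  # alveolus of lung
    "UBERON:0004862": ["UBERON:0002299"],  # left lung alveolus
    "UBERON:0004861": ["UBERON:0002299"],  # right lung alveolus
    "UBERON:0003215": ["UBERON:0000064"],  # alveolus
}


def invert_parents(parents_by_id):
    """Hypothetical helper: map each parent ID to its sorted direct children."""
    children = defaultdict(list)
    for term_id, parent_ids in parents_by_id.items():
        for parent_id in parent_ids:
            children[parent_id].append(term_id)
    return {parent: sorted(kids) for parent, kids in children.items()}


children = invert_parents(parents)
print(children["UBERON:0002299"])
```

With only these four rows, the index covers a small slice of the hierarchy; on the full export the same inversion would cover every term.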
Look up terms¶
As for registries, terms can be looked up with auto-complete:
lookup = tissues.lookup()
The . accessor provides normalized terms (lowercase, containing only alphanumeric characters and underscores):
lookup.alveolus_of_lung
Tissue(ontology_id='UBERON:0002299', name='alveolus of lung', definition='Spherical Outcropping Of The Respiratory Bronchioles And Primary Site Of Gas Exchange With The Blood. Alveoli Are Particular To Mammalian Lungs. Different Structures Are Involved In Gas Exchange In Other Vertebrates[Wp].', synonyms='alveolus pulmonis|pulmonary alveolus|respiratory alveolus|lung alveolus', parents=array(['UBERON:0004119', 'UBERON:0003215'], dtype=object))
To look up the exact original strings, convert the lookup object to a dict and use the [] accessor:
lookup_dict = lookup.dict()
lookup_dict["alveolus of lung"]
Tissue(ontology_id='UBERON:0002299', name='alveolus of lung', definition='Spherical Outcropping Of The Respiratory Bronchioles And Primary Site Of Gas Exchange With The Blood. Alveoli Are Particular To Mammalian Lungs. Different Structures Are Involved In Gas Exchange In Other Vertebrates[Wp].', synonyms='alveolus pulmonis|pulmonary alveolus|respiratory alveolus|lung alveolus', parents=array(['UBERON:0004119', 'UBERON:0003215'], dtype=object))
By default, the name field is used to generate lookup keys. You can specify another field to look up:
lookup = tissues.lookup(tissues.ontology_id)
lookup.uberon_0000031
Tissue(ontology_id='UBERON:0000031', name='lamina propria of trachea', definition='A Lamina Propria That Is Part Of A Respiratory Airway.', synonyms='trachea lamina propria mucosa|trachea lamina propria|tracheal lamina propria|lamina propria mucosae of trachea|lamina propria mucosae of windpipe|trachea lamina propria mucosae|windpipe lamina propria mucosae|windpipe lamina propria|lamina propria mucosa of windpipe|lamina propria of windpipe|windpipe lamina propria mucosa|lamina propria mucosa of trachea', parents=array(['UBERON:0004779'], dtype=object))
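The key normalization described in this section can be approximated in a few lines. This is a sketch under assumptions: `to_lookup_key` is a hypothetical function, and the rule it applies (lowercase, then replace every run of characters outside a-z and 0-9 with an underscore) is inferred from the two keys shown above, so bionty's own handling of edge cases such as key collisions or non-ASCII letters may differ.

```python
import re


def to_lookup_key(value: str) -> str:
    """Hypothetical normalizer: lowercase, then collapse other characters to "_"."""
    return re.sub(r"[^0-9a-z]+", "_", value.lower()).strip("_")


names = ["alveolus of lung", "lamina propria of trachea"]

# Attribute-style access needs normalized keys ...
by_key = {to_lookup_key(name): name for name in names}
# ... while dict-style access can keep the original strings.
by_original = {name: name for name in names}

print(to_lookup_key("alveolus of lung"))
print(to_lookup_key("UBERON:0000031"))
```

The two printed keys correspond to `lookup.alveolus_of_lung` and `lookup.uberon_0000031` in the cells above.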
Search terms¶
Search behaves in the same way as it does for registries:
tissues.search("alveolus lung").head(3)
name | ontology_id | definition | synonyms | parents | __ratio__ |
---|---|---|---|---|---|
alveolus of lung | UBERON:0002299 | Spherical Outcropping Of The Respiratory Bronc... | alveolus pulmonis|pulmonary alveolus|respirato... | [UBERON:0004119, UBERON:0003215] | 89.655172 |
left lung alveolus | UBERON:0004862 | An Alveolus That Is Part Of A Left Lung [Autom... | alveolus of lobe of left lung|alveolus of left... | [UBERON:0002299] | 76.470588 |
alveolus | UBERON:0003215 | Organ Part That Has The Form Of A Hollow Cavit... | None | [UBERON:0000064] | 76.190476 |
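The __ratio__ column is a similarity score between the query and each candidate. To illustrate the general idea of ranking by string similarity, the sketch below uses `difflib.SequenceMatcher` from the standard library as a stand-in. The scoring method bionty actually uses is not stated here and may well differ, so the numbers need not agree with the table; `similarity` and `rank` are hypothetical helpers.

```python
from difflib import SequenceMatcher


def similarity(query: str, candidate: str) -> float:
    """Stand-in score on a 0-100 scale, based on difflib's ratio."""
    return 100 * SequenceMatcher(None, query.lower(), candidate.lower()).ratio()


def rank(query, names, limit=3):
    """Return the `limit` best (name, score) pairs, highest score first."""
    scored = [(name, similarity(query, name)) for name in names]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:limit]


names = ["nose", "alveolus", "left lung alveolus", "alveolus of lung"]
top = rank("alveolus lung", names)
for name, score in top:
    print(f"{name}: {score:.2f}")
```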
By default, search also covers synonyms:
tissues.search("nasal sac").head(3)
name | ontology_id | definition | synonyms | parents | __ratio__ |
---|---|---|---|---|---|
nose | UBERON:0000004 | The Olfactory Organ Of Vertebrates, Consisting... | nasal sac|peripheral olfactory organ|nose | [UBERON:0002268, UBERON:0004121, UBERON:0000475] | 100.000000 |
anal sac | UBERON:0008978 | In Carnivores, Either Of Two Sacs Found Betwee... | None | [UBERON:0000062, UBERON:0009856] | 82.352941 |
nasal air sac | UBERON:0013175 | An Air Sac Opening Into The Passage Of The Blo... | blowhole air sac | [UBERON:0004111] | 81.818182 |
You can turn off synonym matching by passing synonyms_field=None:
tissues.search("nasal sac", synonyms_field=None).head(3)
name | ontology_id | definition | synonyms | parents | __ratio__ |
---|---|---|---|---|---|
anal sac | UBERON:0008978 | In Carnivores, Either Of Two Sacs Found Betwee... | None | [UBERON:0000062, UBERON:0009856] | 82.352941 |
nasal air sac | UBERON:0013175 | An Air Sac Opening Into The Passage Of The Blo... | blowhole air sac | [UBERON:0004111] | 81.818182 |
nasal muscle | UBERON:0008522 | Any Muscle Organ That Is Part Of An Nose. | muscle of nose | [UBERON:0001577, UBERON:0004121] | 76.190476 |
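The effect of including synonyms can be sketched as scoring the name together with every pipe-separated synonym and keeping the best score. This is an illustration only: `best_score` and its `use_synonyms` flag are hypothetical, and `difflib` again stands in for whatever scorer bionty uses.

```python
from difflib import SequenceMatcher


def similarity(query: str, candidate: str) -> float:
    """Stand-in score on a 0-100 scale, based on difflib's ratio."""
    return 100 * SequenceMatcher(None, query.lower(), candidate.lower()).ratio()


def best_score(query, name, synonyms, use_synonyms=True):
    """Score the name and, optionally, each "|"-separated synonym; keep the maximum."""
    candidates = [name]
    if use_synonyms and synonyms:
        candidates.extend(synonyms.split("|"))
    return max(similarity(query, candidate) for candidate in candidates)


# Synonyms of "nose" as shown in the table above.
nose_synonyms = "nasal sac|peripheral olfactory organ|nose"

with_synonyms = best_score("nasal sac", "nose", nose_synonyms)
without_synonyms = best_score("nasal sac", "nose", nose_synonyms, use_synonyms=False)
print(with_synonyms, without_synonyms)
```

Because "nasal sac" is itself a synonym of nose, the first call returns a perfect score, while the second falls back to comparing the query with the name alone. That mirrors why nose drops out of the results once synonyms are switched off.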
Search another field (default is .name):
tissues.search(
    "alveolus in the lung",
    field=tissues.definition,
).head()
definition | ontology_id | name | synonyms | parents | __ratio__ |
---|---|---|---|---|---|
An Alveolus That Is Part Of A Left Lung [Automatically Generated Definition]. | UBERON:0004862 | left lung alveolus | alveolus of lobe of left lung|alveolus of left... | [UBERON:0002299] | 78.048780 |
The Epithelial Layer Of The Alveoli[Mp]. The Layer Of Cells Covering The Lining Of The Tiny Air Sacs At The End Of The Bronchioles[Bto]. | UBERON:0004821 | pulmonary alveolus epithelium | epithelial tissue of alveolus|alveolus epithel... | [UBERON:0000487, UBERON:0000115] | 76.923077 |
An Alveolus That Is Part Of A Right Lung [Automatically Generated Definition]. | UBERON:0004861 | right lung alveolus | alveolus of right lung | [UBERON:0002299] | 76.190476 |
An Alveolar Duct That Is Part Of A Left Lung [Automatically Generated Definition]. | UBERON:0003537 | left lung alveolar duct | alveolar duct of left lung | [UBERON:0002173] | 65.217391 |
An Alveolar Duct That Is Part Of A Right Lung [Automatically Generated Definition]. | UBERON:0003536 | right lung alveolar duct | alveolar duct of right lung | [UBERON:0002173] | 63.829787 |
Standardize Tissue identifiers¶
Let us generate a DataFrame that stores a number of Tissue identifiers, some of which are corrupted:
df_orig = pd.DataFrame(
    index=[
        "UBERON:0000000",
        "UBERON:0000005",
        "UBERON:0000001",
        "UBERON:0000002",
        "This tissue does not exist",
    ]
)
df_orig
UBERON:0000000 |
---|
UBERON:0000005 |
UBERON:0000001 |
UBERON:0000002 |
This tissue does not exist |
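We can check which of our values validate against the ontology reference. The call below validates against the name field, so the ontology IDs in the index are all reported as not validated: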
validated = tissues.validate(df_orig.index, tissues.name)
df_orig.index[~validated]
❗ 5 terms (100.00%) are not validated: UBERON:0000000, UBERON:0000005, UBERON:0000001, UBERON:0000002, This tissue does not exist
Index(['UBERON:0000000', 'UBERON:0000005', 'UBERON:0000001', 'UBERON:0000002',
'This tissue does not exist'],
dtype='object')
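Validation depends on which field the values are compared against. As a minimal sketch, it can be modeled as set membership against one column of the reference. The mini reference below holds only the five rows from the DataFrame export shown earlier, and `validate_values` is a hypothetical stand-in, not bionty's `validate`.

```python
# Five rows copied from the DataFrame export earlier on this page.
reference = [
    {"ontology_id": "UBERON:0000000", "name": "processual entity"},
    {"ontology_id": "UBERON:0000002", "name": "uterine cervix"},
    {"ontology_id": "UBERON:0000003", "name": "naris"},
    {"ontology_id": "UBERON:0000004", "name": "nose"},
    {"ontology_id": "UBERON:0000005", "name": "chemosensory organ"},
]


def validate_values(values, records, field):
    """Hypothetical stand-in: True where a value appears in the given field."""
    known = {record[field] for record in records}
    return [value in known for value in values]


values = [
    "UBERON:0000000",
    "UBERON:0000005",
    "UBERON:0000001",
    "UBERON:0000002",
    "This tissue does not exist",
]

by_name = validate_values(values, reference, "name")
by_id = validate_values(values, reference, "ontology_id")
print(by_name)
print(by_id)
```

Compared against names, none of the identifiers match, which is consistent with the warning above. Compared against ontology IDs, three of the five are found in this mini reference. With the real accessor, passing tissues.ontology_id as the field would presumably be the analogous call.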
Ontology source versions¶
For any given entity, we can choose from a number of versions:
bt.Tissue.list_source().df()
id | uid | entity | organism | name | version | in_db | currently_used | description | url | md5 | source_website | dataframe_artifact_id | run_id | created_by_id | updated_at |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
36 | Cwzj | bionty.Tissue | all | uberon | 2024-02-20 | False | True | Uberon multi-species anatomy ontology | http://purl.obolibrary.org/obo/uberon/releases... | 2048667b5fdf93192384bdf53cafba18 | http://obophenotype.github.io/uberon | None | None | 1 | 2024-08-05 13:21:58.380844+00:00 |
37 | svSf | bionty.Tissue | all | uberon | 2023-09-05 | False | False | Uberon multi-species anatomy ontology | http://purl.obolibrary.org/obo/uberon/releases... | abcee3ede566d1311d758b853ccdf5aa | http://obophenotype.github.io/uberon | None | None | 1 | 2024-08-05 13:21:58.380946+00:00 |
38 | 1tLk | bionty.Tissue | all | uberon | 2023-04-19 | False | False | Uberon multi-species anatomy ontology | http://purl.obolibrary.org/obo/uberon/releases... | 5611dd1375d5a95ac7d7de8e25e6016f | http://obophenotype.github.io/uberon | None | None | 1 | 2024-08-05 13:21:58.381048+00:00 |
39 | 6VAw | bionty.Tissue | all | uberon | 2023-02-14 | False | False | Uberon multi-species anatomy ontology | http://purl.obolibrary.org/obo/uberon/releases... | 3f94e22fae4cdde88a555c5cd59c47da | http://obophenotype.github.io/uberon | None | None | 1 | 2024-08-05 13:21:58.381150+00:00 |
40 | 7Iby | bionty.Tissue | all | uberon | 2022-08-19 | False | False | Uberon multi-species anatomy ontology | http://purl.obolibrary.org/obo/uberon/releases... | c7c958a1ee48fdce146f2c1763eed27e | http://obophenotype.github.io/uberon | None | None | 1 | 2024-08-05 13:21:58.381252+00:00 |
# only lists the sources that are currently used
bt.Tissue.list_source(currently_used=True).df()
id | uid | entity | organism | name | version | in_db | currently_used | description | url | md5 | source_website | dataframe_artifact_id | run_id | created_by_id | updated_at |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
36 | Cwzj | bionty.Tissue | all | uberon | 2024-02-20 | False | True | Uberon multi-species anatomy ontology | http://purl.obolibrary.org/obo/uberon/releases... | 2048667b5fdf93192384bdf53cafba18 | http://obophenotype.github.io/uberon | None | None | 1 | 2024-08-05 13:21:58.380844+00:00 |
When instantiating a Bionty object, we can choose a source or version:
source = bt.Source.filter(
    name="uberon", version="2023-04-19", organism="all"
).one()
tissues = bt.Tissue.public(source=source)
tissues
❗ loading non-default source inside a LaminDB instance
PublicOntology
Entity: Tissue
Organism: all
Source: uberon, 2023-04-19
#terms: 15499
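The filter-then-one pattern used above can be sketched in plain Python. The records mirror the Uberon rows of the source table. `filter_sources` and `one` are hypothetical helpers written for this illustration, and the assumption that one should fail unless exactly one record matches is mine rather than something stated on this page.

```python
# Uberon rows from the source table above (only the columns needed here).
sources = [
    {"name": "uberon", "organism": "all", "version": "2024-02-20", "currently_used": True},
    {"name": "uberon", "organism": "all", "version": "2023-09-05", "currently_used": False},
    {"name": "uberon", "organism": "all", "version": "2023-04-19", "currently_used": False},
    {"name": "uberon", "organism": "all", "version": "2023-02-14", "currently_used": False},
    {"name": "uberon", "organism": "all", "version": "2022-08-19", "currently_used": False},
]


def filter_sources(records, **conditions):
    """Hypothetical helper: keep records whose fields equal all given conditions."""
    return [
        record
        for record in records
        if all(record.get(key) == value for key, value in conditions.items())
    ]


def one(matches):
    """Hypothetical helper: return the single match, or fail if there is not exactly one."""
    if len(matches) != 1:
        raise ValueError(f"expected exactly one match, got {len(matches)}")
    return matches[0]


pinned = one(filter_sources(sources, name="uberon", version="2023-04-19", organism="all"))
current = one(filter_sources(sources, currently_used=True))
print(pinned["version"], current["version"])
```

Filtering on the name alone matches all five versions, so `one` would raise in that case. Pinning name, version, and organism together is what makes the selection unambiguous.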
The currently used ontologies can be displayed using:
bt.Source.filter(currently_used=True).df()
id | uid | entity | organism | name | version | in_db | currently_used | description | url | md5 | source_website | dataframe_artifact_id | run_id | created_by_id | updated_at |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 33TU | bionty.Organism | vertebrates | ensembl | release-112 | False | True | Ensembl | https://ftp.ensembl.org/pub/release-112/specie... | 0ec37e77f4bc2d0b0b47c6c62b9f122d | https://www.ensembl.org | None | None | 1 | 2024-08-05 13:21:58.375733+00:00 |
6 | 6bbV | bionty.Organism | bacteria | ensembl | release-57 | False | True | Ensembl | https://ftp.ensemblgenomes.ebi.ac.uk/pub/bacte... | ee28510ed5586ea7ab4495717c96efc8 | https://www.ensembl.org | None | None | 1 | 2024-08-05 13:21:58.376768+00:00 |
7 | 6s9n | bionty.Organism | fungi | ensembl | release-57 | False | True | Ensembl | http://ftp.ensemblgenomes.org/pub/fungi/releas... | dbcde58f4396ab8b2480f7fe9f83df8a | https://www.ensembl.org | None | None | 1 | 2024-08-05 13:21:58.376957+00:00 |
8 | 2PmT | bionty.Organism | metazoa | ensembl | release-57 | False | True | Ensembl | http://ftp.ensemblgenomes.org/pub/metazoa/rele... | 424636a574fec078a61cbdddb05f9132 | https://www.ensembl.org | None | None | 1 | 2024-08-05 13:21:58.377100+00:00 |
9 | 7GPH | bionty.Organism | plants | ensembl | release-57 | False | True | Ensembl | https://ftp.ensemblgenomes.ebi.ac.uk/pub/plant... | eadaa1f3e527e4c3940c90c7fa5c8bf4 | https://www.ensembl.org | None | None | 1 | 2024-08-05 13:21:58.377210+00:00 |
10 | 4tsk | bionty.Organism | all | ncbitaxon | 2023-06-20 | False | True | NCBItaxon Ontology | s3://bionty-assets/df_all__ncbitaxon__2023-06-... | 00d97ba65627f1cd65636d2df22ea76c | https://github.com/obophenotype/ncbitaxon | None | None | 1 | 2024-08-05 13:21:58.377317+00:00 |
11 | 4UGN | bionty.Gene | human | ensembl | release-112 | False | True | Ensembl | s3://bionty-assets/df_human__ensembl__release-... | 4ccda4d88720a326737376c534e8446b | https://www.ensembl.org | None | None | 1 | 2024-08-05 13:21:58.377424+00:00 |
15 | 4r4f | bionty.Gene | mouse | ensembl | release-112 | False | True | Ensembl | s3://bionty-assets/df_mouse__ensembl__release-... | 519cf7b8acc3c948274f66f3155a3210 | https://www.ensembl.org | None | None | 1 | 2024-08-05 13:21:58.377952+00:00 |
19 | 4RPA | bionty.Gene | saccharomyces cerevisiae | ensembl | release-112 | False | True | Ensembl | s3://bionty-assets/df_saccharomyces cerevisiae... | 11775126b101233525a0a9e2dd64edae | https://www.ensembl.org | None | None | 1 | 2024-08-05 13:21:58.378601+00:00 |
22 | 3EYy | bionty.Protein | human | uniprot | 2024-03 | False | True | Uniprot | s3://bionty-assets/df_human__uniprot__2024-03_... | b5b9e7645065b4b3187114f07e3f402f | https://www.uniprot.org | None | None | 1 | 2024-08-05 13:21:58.378914+00:00 |
25 | 01RW | bionty.Protein | mouse | uniprot | 2024-03 | False | True | Uniprot | s3://bionty-assets/df_mouse__uniprot__2024-03_... | b1b6a196eb853088d36198d8e3749ec4 | https://www.uniprot.org | None | None | 1 | 2024-08-05 13:21:58.379352+00:00 |
28 | 3kDh | bionty.CellMarker | human | cellmarker | 2.0 | False | True | CellMarker | s3://bionty-assets/human_cellmarker_2.0_CellMa... | d565d4a542a5c7e7a06255975358e4f4 | http://bio-bigdata.hrbmu.edu.cn/CellMarker | None | None | 1 | 2024-08-05 13:21:58.379886+00:00 |
29 | 7bV5 | bionty.CellMarker | mouse | cellmarker | 2.0 | False | True | CellMarker | s3://bionty-assets/mouse_cellmarker_2.0_CellMa... | 189586732c63be949e40dfa6a3636105 | http://bio-bigdata.hrbmu.edu.cn/CellMarker | None | None | 1 | 2024-08-05 13:21:58.380085+00:00 |
30 | 6LyR | bionty.CellLine | all | clo | 2022-03-21 | False | True | Cell Line Ontology | https://data.bioontology.org/ontologies/CLO/su... | ea58a1010b7e745702a8397a526b3a33 | https://bioportal.bioontology.org/ontologies/CLO | None | None | 1 | 2024-08-05 13:21:58.380229+00:00 |
31 | FxPV | bionty.CellType | all | cl | 2024-02-13 | False | True | Cell Ontology | http://purl.obolibrary.org/obo/cl/releases/202... | d6d962b58c48f372c2c98b71e0833242 | https://obophenotype.github.io/cell-ontology | None | None | 1 | 2024-08-05 13:21:58.380335+00:00 |
36 | Cwzj | bionty.Tissue | all | uberon | 2024-02-20 | False | True | Uberon multi-species anatomy ontology | http://purl.obolibrary.org/obo/uberon/releases... | 2048667b5fdf93192384bdf53cafba18 | http://obophenotype.github.io/uberon | None | None | 1 | 2024-08-05 13:21:58.380844+00:00 |
41 | 5Xov | bionty.Disease | all | mondo | 2024-02-06 | False | True | Mondo Disease Ontology | http://purl.obolibrary.org/obo/mondo/releases/... | 78914fa236773c5ea6605f7570df6245 | https://mondo.monarchinitiative.org | None | None | 1 | 2024-08-05 13:21:58.381353+00:00 |
46 | 4Pd5 | bionty.Disease | human | doid | 2024-01-31 | False | True | Human Disease Ontology | http://purl.obolibrary.org/obo/doid/releases/2... | b36c15a4610757094f8db64b78ae2693 | https://disease-ontology.org | None | None | 1 | 2024-08-05 13:21:58.381857+00:00 |
53 | 5Fi2 | bionty.ExperimentalFactor | all | efo | 3.63.0 | False | True | The Experimental Factor Ontology | http://www.ebi.ac.uk/efo/releases/v3.63.0/efo.owl | 603e6f6981d53d501c5921aa3940b095 | https://bioportal.bioontology.org/ontologies/EFO | None | None | 1 | 2024-08-05 13:21:58.382569+00:00 |
56 | 3405 | bionty.Phenotype | human | hp | 2024-03-06 | False | True | Human Phenotype Ontology | https://github.com/obophenotype/human-phenotyp... | 36b0d00c24a68edb9131707bc146a4c7 | https://hpo.jax.org | None | None | 1 | 2024-08-05 13:21:58.382873+00:00 |
60 | 3oMT | bionty.Phenotype | mammalian | mp | 2024-02-07 | False | True | Mammalian Phenotype Ontology | https://github.com/mgijax/mammalian-phenotype-... | 31c27ed2c7d5774f8b20a77e4e1fd278 | https://github.com/mgijax/mammalian-phenotype-... | None | None | 1 | 2024-08-05 13:21:58.383288+00:00 |
62 | 2K58 | bionty.Phenotype | zebrafish | zp | 2024-01-22 | False | True | Zebrafish Phenotype Ontology | https://github.com/obophenotype/zebrafish-phen... | 01600a5d392419b27fc567362d4cfff8 | https://github.com/obophenotype/zebrafish-phen... | None | None | 1 | 2024-08-05 13:21:58.383489+00:00 |
65 | 3ox8 | bionty.Phenotype | all | pato | 2023-05-18 | False | True | Phenotype And Trait Ontology | http://purl.obolibrary.org/obo/pato/releases/2... | bd472f4971492109493d4ad8a779a8dd | https://github.com/pato-ontology/pato | None | None | 1 | 2024-08-05 13:21:58.386013+00:00 |
66 | 3RSX | bionty.Pathway | all | go | 2023-05-10 | False | True | Gene Ontology | https://data.bioontology.org/ontologies/GO/sub... | e9845499eadaef2418f464cd7e9ac92e | http://geneontology.org | None | None | 1 | 2024-08-05 13:21:58.386124+00:00 |
69 | 3rm9 | BFXPipeline | all | lamin | 1.0.0 | False | True | Bioinformatics Pipeline | s3://bionty-assets/bfxpipelines.json | a7eff57a256994692fba46e0199ffc94 | https://lamin.ai | None | None | 1 | 2024-08-05 13:21:58.386433+00:00 |
70 | 5alK | Drug | all | dron | 2024-03-02 | False | True | Drug Ontology | https://data.bioontology.org/ontologies/DRON/s... | 84138459de4f65034e979f4e46783747 | https://bioportal.bioontology.org/ontologies/DRON | None | None | 1 | 2024-08-05 13:21:58.386533+00:00 |
72 | 7Zm9 | bionty.DevelopmentalStage | human | hsapdv | 2020-03-10 | False | True | Human Developmental Stages | http://aber-owl.net/media/ontologies/HSAPDV/11... | 52181d59df84578ed69214a5cb614036 | https://github.com/obophenotype/developmental-... | None | None | 1 | 2024-08-05 13:21:58.386735+00:00 |
73 | 6vJm | bionty.DevelopmentalStage | mouse | mmusdv | 2020-03-10 | False | True | Mouse Developmental Stages | http://aber-owl.net/media/ontologies/MMUSDV/9/... | 5bef72395d853c7f65450e6c2a1fc653 | https://github.com/obophenotype/developmental-... | None | None | 1 | 2024-08-05 13:21:58.386837+00:00 |
74 | MJRq | bionty.Ethnicity | human | hancestro | 3.0 | False | True | Human Ancestry Ontology | https://github.com/EBISPOT/hancestro/raw/3.0/h... | 76dd9efda9c2abd4bc32fc57c0b755dd | https://github.com/EBISPOT/hancestro | None | None | 1 | 2024-08-05 13:21:58.386946+00:00 |
75 | 5JnV | BioSample | all | ncbi | 2023-09 | False | True | NCBI BioSample attributes | s3://bionty-assets/df_all__ncbi__2023-09__BioS... | 918db9bd1734b97c596c67d9654a4126 | https://www.ncbi.nlm.nih.gov/biosample/docs/at... | None | None | 1 | 2024-08-05 13:21:58.387049+00:00 |