Spatial

Note

Please see:

This page is currently only a stub.

# !pip install 'lamindb[jupyter,bionty]'
!lamin init --storage ./test-spatial --schema bionty
💡 connected lamindb: testuser1/test-spatial
import lamindb as ln
import bionty as bt
import matplotlib.pyplot as plt
import scanpy as sc
ln.settings.transform.stem_uid = "daeFs3PkquDW"
ln.settings.transform.version = "draft"
ln.track()
💡 notebook imports: bionty==0.47.1 lamindb==0.75.0 matplotlib==3.9.0 scanpy==1.9.6
💡 saved: Transform(uid='daeFs3PkquDW7a3V', version='draft', name='Spatial', key='spatial', type='notebook', created_by_id=1, updated_at='2024-08-05 13:27:50 UTC')
💡 saved: Run(uid='uBzalHUVG0kMvaXxGKfX', transform_id=1, created_by_id=1)
Run(uid='uBzalHUVG0kMvaXxGKfX', started_at='2024-08-05 13:27:50 UTC', is_consecutive=True, transform_id=1, created_by_id=1)

An example spatial dataset

Here, we have a spatial gene expression dataset measured using Visium from Suo22.

This collection contains two parts:

  1. a high-resolution image of a slice of fetal liver

  2. a single-cell expression dataset in .h5ad format

img_path = ln.core.datasets.file_tiff_suo22()
img = plt.imread(img_path)
plt.imshow(img)
plt.show()
_images/42e19325a0bdf0a25162edac62d522b7f4aeeb387ec4747fa70d903d17c6d257.png
adata = ln.core.datasets.anndata_suo22_Visium10X()
# subset to the same image
adata = adata[adata.obs["img_id"] == "F121_LP1_4LIV"].copy()
adata
AnnData object with n_obs × n_vars = 3027 × 191
    obs: 'in_tissue', 'array_row', 'array_col', 'sample', 'n_genes_by_counts', 'log1p_n_genes_by_counts', 'total_counts', 'log1p_total_counts', 'pct_counts_in_top_50_genes', 'pct_counts_in_top_100_genes', 'pct_counts_in_top_200_genes', 'pct_counts_in_top_500_genes', 'mt_frac', 'img_id', 'EXP_id', 'Organ', 'Fetal_id', 'SN', 'Visium_Area_id', 'Age_PCW', 'Digestion time', 'paths', 'sample_id', '_scvi_batch', '_scvi_labels', '_indices', 'total_cell_abundance'
    var: 'feature_types', 'genome', 'SYMBOL', 'mt'
    obsm: 'NMF', 'means_cell_abundance_w_sf', 'q05_cell_abundance_w_sf', 'q95_cell_abundance_w_sf', 'spatial', 'stds_cell_abundance_w_sf'
# plot where CD45+ leukocytes are in the slice
sc.pl.scatter(adata, "array_row", "array_col", color="ENSG00000081237")
_images/179081c565bc82cbb6d6b2d06d29e8f57d7ab0ea73263185e4985ff5e28aa665.png
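The two steps above (subsetting to one image, then locating spots that express a gene) can be illustrated without AnnData. This is only a minimal sketch in plain Python: the spot records, their expression values, and the `OTHER_IMAGE` identifier are made up for illustration, and the helper functions `subset_by_image` and `positive_positions` are hypothetical, not part of lamindb or scanpy.

```python
# Toy spot table: every value here is invented for illustration.
spots = [
    {"img_id": "F121_LP1_4LIV", "array_row": 0, "array_col": 0, "expression": 0.0},
    {"img_id": "F121_LP1_4LIV", "array_row": 0, "array_col": 2, "expression": 1.5},
    {"img_id": "F121_LP1_4LIV", "array_row": 1, "array_col": 1, "expression": 3.0},
    {"img_id": "OTHER_IMAGE", "array_row": 0, "array_col": 0, "expression": 2.0},
]


def subset_by_image(rows, img_id):
    """Keep only the spots that belong to one image (hypothetical helper)."""
    return [row for row in rows if row["img_id"] == img_id]


def positive_positions(rows):
    """Return (array_row, array_col) for spots with non-zero expression."""
    return [
        (row["array_row"], row["array_col"])
        for row in rows
        if row["expression"] > 0
    ]


same_image = subset_by_image(spots, "F121_LP1_4LIV")
positions = positive_positions(same_image)
print(positions)
```

In the notebook itself, the boolean mask on `adata.obs["img_id"]` plays the role of `subset_by_image`, and the scatter plot colors every spot by its expression value rather than applying a threshold.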

Validate annotations

We'll register the single-cell data and the image as a Collection. First, we validate the annotations of the AnnData object.

curate = ln.Curate.from_anndata(
    adata,
    var_index=bt.Gene.ensembl_gene_id,
    categoricals={"sample": ln.ULabel.name},
    organism="human",
)
✅ added 1 record with Feature.name for columns: 'sample'
❗ 26 non-validated categories are not saved in Feature.name: ['Age_PCW', 'pct_counts_in_top_50_genes', 'pct_counts_in_top_100_genes', 'Fetal_id', 'n_genes_by_counts', '_scvi_batch', 'total_counts', 'Organ', 'EXP_id', 'total_cell_abundance', 'array_col', 'pct_counts_in_top_200_genes', '_scvi_labels', 'SN', 'log1p_total_counts', 'img_id', 'array_row', 'Digestion time', '_indices', 'sample_id', 'log1p_n_genes_by_counts', 'pct_counts_in_top_500_genes', 'mt_frac', 'paths', 'in_tissue', 'Visium_Area_id']!
      → to lookup categories, use lookup().columns
      → to save, run add_new_from_columns
✅ added 191 records from public with Gene.ensembl_gene_id for var_index: 'ENSG00000002586', 'ENSG00000004468', 'ENSG00000004897', 'ENSG00000007312', 'ENSG00000008086', 'ENSG00000008128', 'ENSG00000010278', 'ENSG00000010610', 'ENSG00000012124', 'ENSG00000013725', 'ENSG00000019582', 'ENSG00000026508', 'ENSG00000039068', 'ENSG00000059758', 'ENSG00000062038', 'ENSG00000065883', 'ENSG00000066294', 'ENSG00000070831', 'ENSG00000071991', 'ENSG00000073754', ...
curate.validate()
✅ var_index is validated against Gene.ensembl_gene_id
💡 mapping sample on ULabel.name
❗    1 terms is not validated: 'WSSS_F_IMMsp9838712'
      → save terms via .add_new_from('sample')
False
curate.add_new_from('sample')
✅ added 1 record with ULabel.name for sample: 'WSSS_F_IMMsp9838712'
curate.validate()
✅ var_index is validated against Gene.ensembl_gene_id
✅ sample is validated against ULabel.name
True
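The validate, add, re-validate cycle shown above can be sketched conceptually as a set-membership check against a registry. The `ToyRegistry` class below is a hypothetical stand-in written for illustration; it is not how lamindb implements `Curate`, and it only mirrors the observable behavior in the output above (first `False`, then `True` after the missing term is added).

```python
class ToyRegistry:
    """Hypothetical in-memory stand-in for a registry such as ULabel."""

    def __init__(self, names=()):
        self.names = set(names)

    def validate(self, values):
        """Return (is_valid, sorted list of values missing from the registry)."""
        missing = sorted(set(values) - self.names)
        return (not missing, missing)

    def add_new(self, values):
        """Register the values that are not yet known and return them."""
        added = sorted(set(values) - self.names)
        self.names.update(added)
        return added


# The 'sample' column holds a single category in the subset used here.
sample_column = ["WSSS_F_IMMsp9838712"] * 3

registry = ToyRegistry()
first_ok, first_missing = registry.validate(sample_column)    # fails: term unknown
added = registry.add_new(sample_column)                       # register the term
second_ok, second_missing = registry.validate(sample_column)  # passes now
print(first_ok, first_missing, added, second_ok)
```

The point of the two-step design is that nothing new enters the registry silently: validation only reports, and adding terms is an explicit, separate call.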

Register curated artifact

artifact_ad = curate.save_artifact(description="Suo22 Visium10X image F121_LP1_4LIV")
💡 path content will be copied to default storage upon `save()` with key `None` ('.lamindb/FIpG2yB33beie1vaNmON.h5ad')
✅ storing artifact 'FIpG2yB33beie1vaNmON' at '/home/runner/work/lamin-usecases/lamin-usecases/docs/test-spatial/.lamindb/FIpG2yB33beie1vaNmON.h5ad'
💡 parsing feature names of X stored in slot 'var'
✅    191 terms (100.00%) are validated for ensembl_gene_id
✅    linked: FeatureSet(uid='O5p7uvaNtbpIN16KEvEy', n=191, dtype='float', registry='bionty.Gene', hash='f29u0HJ47KiqdYQuuhNzeQ', created_by_id=1, run_id=1)
💡 parsing feature names of slot 'obs'
✅    1 term (3.70%) is validated for name
❗    26 terms (96.30%) are not validated for name: in_tissue, array_row, array_col, n_genes_by_counts, log1p_n_genes_by_counts, total_counts, log1p_total_counts, pct_counts_in_top_50_genes, pct_counts_in_top_100_genes, pct_counts_in_top_200_genes, pct_counts_in_top_500_genes, mt_frac, img_id, EXP_id, Organ, Fetal_id, SN, Visium_Area_id, Age_PCW, Digestion time, ...
✅    linked: FeatureSet(uid='2Zzo8YGfMKzws4rGGdGe', n=1, registry='Feature', hash='AbbaEs_catrIflDPn6zPgA', created_by_id=1, run_id=1)
✅ saved 2 feature sets for slots: 'var','obs'
artifact_ad.describe()
Artifact(uid='FIpG2yB33beie1vaNmON', description='Suo22 Visium10X image F121_LP1_4LIV', suffix='.h5ad', type='dataset', _accessor='AnnData', size=9743793, hash='MRyvckic_gbrV_hHpHOlAQ', _hash_type='md5', n_observations=3027, visibility=1, _key_is_virtual=True, updated_at='2024-08-05 13:28:10 UTC')
  Provenance
    .created_by = 'testuser1'
    .storage = '/home/runner/work/lamin-usecases/lamin-usecases/docs/test-spatial'
    .transform = 'Spatial'
    .run = '2024-08-05 13:27:50 UTC'
  Labels
    .ulabels = 'WSSS_F_IMMsp9838712'
  Features
    'sample' = 'WSSS_F_IMMsp9838712'
  Feature sets
    'var' = 'CD99', 'CD38', 'CDC27', 'CD79B', 'CDKL5', 'CDK11A', 'CD9', 'CD4', 'CD22', 'CD6', 'CD74', 'CD44', 'CDH1', 'CDK17', 'CDH3', 'CDK13', 'CD84', 'CDC42', 'CDH19', 'CD5L'
    'obs' = 'sample'
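The percentages in the log output above (3.70% validated, 96.30% not validated) follow from the 27 columns of `adata.obs`, of which only `sample` was registered as a feature. A small sketch of that arithmetic, using the column names listed in the AnnData summary; the variable names are illustrative only:

```python
# The 27 obs columns as listed in the AnnData summary.
obs_columns = [
    "in_tissue", "array_row", "array_col", "sample", "n_genes_by_counts",
    "log1p_n_genes_by_counts", "total_counts", "log1p_total_counts",
    "pct_counts_in_top_50_genes", "pct_counts_in_top_100_genes",
    "pct_counts_in_top_200_genes", "pct_counts_in_top_500_genes",
    "mt_frac", "img_id", "EXP_id", "Organ", "Fetal_id", "SN",
    "Visium_Area_id", "Age_PCW", "Digestion time", "paths", "sample_id",
    "_scvi_batch", "_scvi_labels", "_indices", "total_cell_abundance",
]

# Only 'sample' was passed as a categorical and registered as a feature.
registered_features = {"sample"}

validated = [c for c in obs_columns if c in registered_features]
not_validated = [c for c in obs_columns if c not in registered_features]

pct_validated = 100 * len(validated) / len(obs_columns)
pct_not_validated = 100 * len(not_validated) / len(obs_columns)

print(f"{len(validated)} validated ({pct_validated:.2f}%)")
print(f"{len(not_validated)} not validated ({pct_not_validated:.2f}%)")
```

This is why the 'obs' feature set of the artifact contains only `sample`: the remaining 26 columns stay in the file but are not linked to registered features.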

Register a collection

artifact_img = ln.Artifact(img_path, description="Suo22 image F121_LP1_4LIV")
artifact_img.save()
💡 path content will be copied to default storage upon `save()` with key `None` ('.lamindb/7AP1cdIMEC6uLtxe3txK.tiff')
✅ storing artifact '7AP1cdIMEC6uLtxe3txK' at '/home/runner/work/lamin-usecases/lamin-usecases/docs/test-spatial/.lamindb/7AP1cdIMEC6uLtxe3txK.tiff'
Artifact(uid='7AP1cdIMEC6uLtxe3txK', description='Suo22 image F121_LP1_4LIV', suffix='.tiff', size=119764004, hash='ZAnyai4Ys01P2fLR_aDIvq', _hash_type='sha1-fl', visibility=1, _key_is_virtual=True, created_by_id=1, storage_id=1, transform_id=1, run_id=1, updated_at='2024-08-05 13:28:10 UTC')
collection = ln.Collection([artifact_ad, artifact_img], name="Suo22")
collection.save()
Collection(uid='bXl2owreXlfov4mslR1Q', name='Suo22', hash='8BBgP1aPBOFv9jCebMhGdw', visibility=1, created_by_id=1, transform_id=1, run_id=1, updated_at='2024-08-05 13:28:10 UTC')
# clean up test instance
# note: `lamin delete` raises InstanceNotEmpty here because the storage still holds the two saved artifacts
!lamin delete --force test-spatial
# remove the storage directory created by `lamin init --storage ./test-spatial`
!rm -r test-spatial
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.10.14/x64/bin/lamin", line 8, in <module>
    sys.exit(main())
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/rich_click/rich_command.py", line 367, in __call__
    return super().__call__(*args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/rich_click/rich_command.py", line 152, in main
    rv = self.invoke(ctx)
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/lamin_cli/__main__.py", line 164, in delete
    return delete(instance, force=force)
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/lamindb_setup/_delete.py", line 98, in delete
    n_objects = check_storage_is_empty(
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/lamindb_setup/core/upath.py", line 779, in check_storage_is_empty
    raise InstanceNotEmpty(message)
lamindb_setup.core.upath.InstanceNotEmpty: Storage /home/runner/work/lamin-usecases/lamin-usecases/docs/test-spatial/.lamindb contains 2 objects ('_is_initialized' ignored) - delete them prior to deleting the instance