{ "cells": [ { "cell_type": "markdown", "id": "2b8e72a7-129c-422c-b955-350fb9ee0541", "metadata": { "tags": [] }, "source": [ "# Tutorial: SOMA Experiment queries" ] }, { "cell_type": "code", "execution_count": 1, "id": "3a5fd5d3", "metadata": { "tags": [] }, "outputs": [], "source": [ "import tiledbsoma as soma" ] }, { "cell_type": "markdown", "id": "ccc8709a", "metadata": { "tags": [] }, "source": [ "In this notebook, we'll take a quick look at the SOMA experiment-query API. The dataset consists of Peripheral Blood Mononuclear Cells (PBMC) and is freely available from 10X Genomics.\n" ] }, { "cell_type": "markdown", "id": "2472cd1a-2d49-4268-9b9b-1bed49ccfa1b", "metadata": { "tags": [] }, "source": [ "First we'll unpack and open the experiment:" ] }, { "cell_type": "code", "execution_count": 2, "id": "c70b2d82-2012-481c-a7a6-5b574de69241", "metadata": { "tags": [] }, "outputs": [], "source": [ "import tarfile\n", "import tempfile\n", "\n", "sparse_uri = tempfile.mkdtemp()\n", "with tarfile.open(\"data/pbmc3k-sparse.tgz\") as handle:\n", "    handle.extractall(sparse_uri)\n", "exp = soma.Experiment.open(sparse_uri)" ] }, { "cell_type": "markdown", "id": "fab7898c", "metadata": { "tags": [] }, "source": [ "The keys of the `obs` dataframe show which fields are available to query on." ] }, { "cell_type": "code", "execution_count": 3, "id": "d67dfbc6-0382-4acc-8c56-3670549654f8", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "('soma_joinid', 'obs_id', 'n_genes', 'percent_mito', 'n_counts', 'louvain')" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "exp.obs.keys()" ] }, { "cell_type": "code", "execution_count": 4, "id": "9e4ede09-2303-4c21-92c1-bf42ed4e7dd1", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", " | louvain | \n", "
---|---|
0 | \n", "CD4 T cells | \n", "
1 | \n", "B cells | \n", "
2 | \n", "CD4 T cells | \n", "
3 | \n", "CD14+ Monocytes | \n", "
4 | \n", "NK cells | \n", "
... | \n", "... | \n", "
2633 | \n", "CD14+ Monocytes | \n", "
2634 | \n", "B cells | \n", "
2635 | \n", "B cells | \n", "
2636 | \n", "B cells | \n", "
2637 | \n", "CD4 T cells | \n", "
2638 rows × 1 columns
\n", "\n", " | soma_dim_0 | \n", "soma_dim_1 | \n", "soma_data | \n", "
---|---|---|---|
0 | \n", "1 | \n", "0 | \n", "-0.214582 | \n", "
1 | \n", "1 | \n", "1 | \n", "-0.372653 | \n", "
2 | \n", "1 | \n", "2 | \n", "-0.054804 | \n", "
3 | \n", "1 | \n", "3 | \n", "-0.683391 | \n", "
4 | \n", "1 | \n", "4 | \n", "0.633951 | \n", "
... | \n", "... | \n", "... | \n", "... | \n", "
911643 | \n", "2636 | \n", "1833 | \n", "-0.149789 | \n", "
911644 | \n", "2636 | \n", "1834 | \n", "-0.325824 | \n", "
911645 | \n", "2636 | \n", "1835 | \n", "-0.005918 | \n", "
911646 | \n", "2636 | \n", "1836 | \n", "-0.135213 | \n", "
911647 | \n", "2636 | \n", "1837 | \n", "-0.482111 | \n", "
911648 rows × 3 columns
\n", "