The tiledbsoma module¶
SOMA powered by TileDB
SOMA — stack of matrices, annotated — is a flexible, extensible, and
open-source API enabling access to data in a variety of formats, and is
motivated by use cases from single-cell biology. The tiledbsoma
Python package is an implementation of SOMA using the
TileDB Embedded engine.
Provides¶
- The ability to store, query, and retrieve larger-than-core datasets, resident in both cloud (object-store) and local (file) systems.
- A data model supporting dataframes, and both sparse and dense multi-dimensional arrays.
- An extended data model with support for single-cell biology data.
See the SOMA GitHub repo for more information on the SOMA project.
Using the documentation¶
Documentation is also available through Python's built-in help function. We
recommend using it to explore the package. For example:
>>> import tiledbsoma
>>> help(tiledbsoma.DataFrame)
Data types¶
The principal persistent types provided by SOMA are:
- Collection – a string-keyed container of SOMA objects.
- DataFrame – a multi-column table with a user-defined schema that defines the number of columns and each column's name and value type.
- SparseNDArray – a sparse multi-dimensional array storing Arrow primitive data types, such as int and float.
- DenseNDArray – a dense multi-dimensional array storing Arrow primitive data types, such as int and float.
- Experiment – a specialized Collection representing an annotated 2-D matrix of measurements.
- Measurement – a specialized Collection, for use within the Experiment class, representing a set of measurements on a single set of variables (features, e.g., genes).
SOMA Experiment and Measurement are inspired by use cases from single-cell biology.
SOMA uses the Arrow type system and memory model for its in-memory type system and schema. For
example, the schema of a DataFrame is expressed as an Arrow Schema.
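As a minimal sketch of how an Arrow Schema describes a DataFrame, the following helper creates a small table, writes three rows, and reads them back. The helper name roundtrip_obs, the cell_type and n_genes columns, and the domain bounds are illustrative assumptions. The calls reflect the tiledbsoma Python API as we understand it and may differ between releases.

```python
def roundtrip_obs(uri: str):
    """Create a SOMA DataFrame at `uri`, write three rows, and read them back.

    Returns the stored rows as a pyarrow.Table.
    """
    # Imported inside the function so this sketch can be loaded even where
    # the packages are not installed.
    import pyarrow as pa
    import tiledbsoma

    # The schema is a plain Arrow Schema. SOMA DataFrames carry an int64
    # `soma_joinid` column; `cell_type` and `n_genes` are made-up names.
    schema = pa.schema(
        [
            ("soma_joinid", pa.int64()),
            ("cell_type", pa.large_string()),
            ("n_genes", pa.int32()),
        ]
    )

    rows = pa.Table.from_pydict(
        {
            "soma_joinid": [0, 1, 2],
            "cell_type": ["B cell", "T cell", "NK cell"],
            "n_genes": [1200, 950, 1430],
        },
        schema=schema,
    )

    # `domain` bounds the index column; (0, 99) is an arbitrary choice
    # that comfortably covers the three example rows.
    with tiledbsoma.DataFrame.create(
        uri,
        schema=schema,
        index_column_names=["soma_joinid"],
        domain=[(0, 99)],
    ) as df:
        df.write(rows)

    # Re-open for reading and concatenate all result batches into one table.
    with tiledbsoma.DataFrame.open(uri) as df:
        return df.read().concat()
```

Calling roundtrip_obs with a fresh local directory path (or an object-store URI) is expected to return the three rows as a single Arrow table.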
Error handling¶
Most errors are signaled by raising an exception. Of note:
- NotImplementedError is raised when the requested function or method is unsupported.
- SOMAError is the base class for all SOMA-specific errors.
- Most other errors raise an appropriate built-in Python exception, e.g., TypeError or ValueError.
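The error categories above can be handled roughly as follows. This is a sketch, not the package's recommended pattern: open_if_exists and describe_failure are hypothetical helpers, the optional soma parameter exists only so a stand-in module can be substituted, and the names tiledbsoma.open and tiledbsoma.DoesNotExistError are assumed from the package's public API.

```python
def open_if_exists(uri: str, soma=None):
    """Return an opened SOMA object, or None if nothing exists at `uri`."""
    if soma is None:
        import tiledbsoma as soma

    try:
        return soma.open(uri)
    except soma.DoesNotExistError:
        # SOMA-specific error: the object is missing or inaccessible.
        return None


def describe_failure(exc: BaseException, soma=None) -> str:
    """Map an exception onto the broad categories used by SOMA."""
    if soma is None:
        import tiledbsoma as soma

    if isinstance(exc, NotImplementedError):
        return "unsupported operation"
    if isinstance(exc, soma.SOMAError):
        # Covers every SOMA-specific subclass, since SOMAError is the base.
        return "SOMA-specific error"
    if isinstance(exc, (TypeError, ValueError)):
        return "invalid argument"
    return "other error"
```

Catching SOMAError rather than a bare Exception keeps SOMA-specific failures separate from argument errors such as TypeError or ValueError, which usually indicate a bug in the calling code.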
Classes¶
- tiledbsoma.Collection
- tiledbsoma.Experiment
- tiledbsoma.Measurement
- tiledbsoma.DataFrame
- tiledbsoma.SparseNDArray
- tiledbsoma.SparseNDArrayRead
- tiledbsoma.DenseNDArray
- tiledbsoma.Axis
- tiledbsoma.CoordinateSpace
- tiledbsoma.MultiscaleImage
- tiledbsoma.PointCloudDataFrame
- tiledbsoma.Scene
- tiledbsoma.AffineTransform
- tiledbsoma.IdentityTransform
- tiledbsoma.ScaleTransform
- tiledbsoma.UniformScaleTransform
- tiledbsoma.ResultOrder
- tiledbsoma.AxisColumnNames
- tiledbsoma.AxisQuery
- tiledbsoma.ExperimentAxisQuery
- tiledbsoma.SOMATileDBContext
- tiledbsoma.TileDBCreateOptions
- tiledbsoma.TileDBWriteOptions
- tiledbsoma.IntIndexer
Exceptions¶
- Base error type for SOMA-specific exceptions.
- Raised when attempting to open a non-existent or inaccessible SOMA object.
- Raised when attempting to create an already existing SOMA object.
- Raised when attempting to create an already existing SOMA object.
Functions¶
- Opens a TileDB SOMA object.
- Nominal use is for bug reports, so issue filers and issue fixers can be on the same page.
- Returns semver-compatible version of the supported SOMA API.
- Returns the implementation name, e.g., "python-tiledb".
- Returns the package implementation version as a semver.
- Returns underlying storage engine name, e.g., "tiledb".
- Enable TileDB internal statistics.
- Disable TileDB internal statistics.
- Reset all TileDB internal statistics to 0.
- Print TileDB internal statistics.
- Returns tiledbsoma stats as a Python dict.
- Returns tiledbsoma stats as a JSON string.
- Initialize re-indexer for provided indices (deprecated).
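The version-reporting functions can be combined into a short report to attach to a bug report. The sketch below assumes they are exposed as get_implementation, get_implementation_version, get_SOMA_version, and get_storage_engine. Those names do not appear in the listing above and should be checked against the installed package. The helper soma_version_report and its soma parameter are hypothetical.

```python
import json


def soma_version_report(soma=None) -> str:
    """Return implementation and API versions as a JSON string."""
    if soma is None:
        import tiledbsoma as soma

    report = {
        "implementation": soma.get_implementation(),
        "implementation_version": soma.get_implementation_version(),
        "soma_api_version": soma.get_SOMA_version(),
        "storage_engine": soma.get_storage_engine(),
    }
    # Sorted keys keep the output stable so two reports are easy to compare.
    return json.dumps(report, indent=2, sort_keys=True)
```

With the package installed, print(soma_version_report()) emits the four values as a single JSON document.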