tiledbsoma.SparseNDArray

class tiledbsoma.SparseNDArray(handle: _WrapperType_co | DataFrameWrapper | DenseNDArrayWrapper | SparseNDArrayWrapper, *, _dont_call_this_use_create_or_open_instead: str = 'unset')

SparseNDArray is a sparse, N-dimensional array, with offset (zero-based) integer indexing on each dimension. SparseNDArray has a user-defined schema, which includes:

  • The element type, expressed as an Arrow type, indicating the type of data contained within the array.

  • The shape of the array, i.e., the number of dimensions and the length of each dimension.

Every dimension must have a positive length, and the array must have at least one dimension. Elements that are not explicitly stored in the array are assumed to have a value of zero.

Where explicitly referenced in the API, the dimensions are named soma_dim_N, where N is the dimension number (e.g., soma_dim_0), and elements are named soma_data.
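The naming convention and the implicit-zero rule can be pictured with a small toy model. This is a plain-Python sketch, not the tiledbsoma implementation: the helper value_at and the sample values are hypothetical, and only the column names soma_dim_0, soma_dim_1, and soma_data come from the API.

```python
# Toy model only: plain Python, not how tiledbsoma stores data.
# A 2-D sparse array held as parallel columns, one per dimension
# plus one for the element values, using the names the API exposes.
shape = (4, 3)
columns = {
    "soma_dim_0": [0, 1, 3],        # coordinate along dimension 0
    "soma_dim_1": [2, 0, 1],        # coordinate along dimension 1
    "soma_data": [1.5, 2.0, 7.25],  # value of each stored element
}


def value_at(columns, shape, coord):
    """Return the element at coord; cells never written read as zero.

    Hypothetical helper for illustration, not part of tiledbsoma.
    """
    ndim = len(shape)
    if len(coord) != ndim or any(not 0 <= c < n for c, n in zip(coord, shape)):
        raise IndexError(f"{coord} is outside shape {shape}")
    dims = [columns[f"soma_dim_{i}"] for i in range(ndim)]
    for stored_coord, value in zip(zip(*dims), columns["soma_data"]):
        if stored_coord == tuple(coord):
            return value
    return 0.0  # implicit zero


print(value_at(columns, shape, (1, 0)))  # explicitly stored element
print(value_at(columns, shape, (2, 2)))  # never written, so zero
```

Coordinates are zero-based, so the valid indices along a dimension of length n run from 0 to n - 1.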

Lifecycle

Maturing.

Examples

>>> import tiledbsoma
>>> import pyarrow as pa
>>> import numpy as np
>>> import scipy.sparse
>>> with tiledbsoma.SparseNDArray.create(
...     "./test_sparse_ndarray", type=pa.float32(), shape=(1000, 100)
... ) as arr:
...     data = pa.SparseCOOTensor.from_scipy(
...         scipy.sparse.random(1000, 100, format="coo", dtype=np.float32)
...     )
...     arr.write(data)
>>> with tiledbsoma.SparseNDArray.open("./test_sparse_ndarray") as arr:
...     print(arr.schema)
...     print('---')
...     print(arr.read().coos().concat())
...
soma_dim_0: int64
soma_dim_1: int64
soma_data: float
---
<pyarrow.SparseCOOTensor>
type: float
shape: (1000, 100)

__init__(handle: _WrapperType_co | DataFrameWrapper | DenseNDArrayWrapper | SparseNDArrayWrapper, *, _dont_call_this_use_create_or_open_instead: str = 'unset')

Internal-only common initializer steps.

This function is internal; users should open TileDB SOMA objects using the create() and open() factory class methods.

Methods

__init__(handle, *[, ...])

Internal-only common initializer steps.

exists(uri[, context, tiledb_timestamp])

Finds whether an object of this type exists at the given URI.

create(uri, *, type, shape[, ...])

Creates a SOMA NDArray at the given URI.

open(uri[, mode, tiledb_timestamp, context, ...])

Opens this specific type of SOMA object.

reopen(mode[, tiledb_timestamp])

Return a new copy of the SOMAObject with the given mode at the current Unix timestamp.

close()

Release any resources held while the object is open.

read([coords, result_order, batch_size, ...])

Reads a user-defined slice of the SparseNDArray.

write(values, *[, platform_config])

Writes an Arrow object to the SparseNDArray.

verify_open_for_writing()

Raises an error if the object is not open for writing.

non_empty_domain()

Retrieves the non-empty domain for each dimension, namely the smallest and largest indices in each dimension at which the array has stored data.

tiledbsoma_upgrade_shape(newshape[, check_only])

Allows the array to have a resizeable shape as described in the TileDB-SOMA 1.15 release notes.

resize(newshape[, check_only])

Increases the shape of the array as specified.

config_options_from_schema()

Returns metadata about the array that is not encompassed within the Arrow Schema, in the form of a PlatformConfig (deprecated).
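The difference between the non-empty domain reported by non_empty_domain() and the array's shape can be sketched in plain Python. This is a toy model under stated assumptions, not the library's code: the function below is a hypothetical stand-in that works on a list of coordinate tuples, and returning None for an empty input is a choice made for this sketch only.

```python
# Toy model only: computes, from a list of stored coordinates, the
# per-dimension (smallest, largest) index pair that the description
# of non_empty_domain() refers to. Not the tiledbsoma implementation.
def non_empty_domain(coords, ndim):
    """Smallest and largest stored index along each dimension."""
    if not coords:
        return None  # sketch-only choice; real behavior is not modeled here
    return tuple(
        (min(c[d] for c in coords), max(c[d] for c in coords))
        for d in range(ndim)
    )


shape = (1000, 100)                   # capacity of each dimension
coords = [(5, 17), (42, 3), (7, 99)]  # where data was actually written

domain = non_empty_domain(coords, ndim=len(shape))
print(domain)  # ((5, 42), (3, 99))
```

The shape bounds where data may be written, while the non-empty domain only describes where data currently exists, so each reported pair always lies within 0 and the dimension's length minus one.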

Attributes

uri

Accessor for the object's storage URI.

soma_type

A string describing the SOMA type of this object.

schema

Returns data schema, in the form of an Arrow Schema.

is_sparse

True if the array is sparse, False if it is dense.

ndim

The number of dimensions in this array.

nnz

The number of stored values in the array, including explicitly stored zeros.

shape

Returns capacity of each dimension, always a list of length ndim.

maxshape

Returns the maximum resizable capacity of each dimension, always a list of length ndim.

tiledbsoma_has_upgraded_shape

Returns True if the array has the upgraded resizeable shape feature from TileDB-SOMA 1.15, meaning that either the array was created with this support or tiledbsoma_upgrade_shape has been applied to it.

mode

The mode this object was opened in, either r or w.

closed

True if the object has been closed.

context

A value storing implementation-specific configuration information.

tiledb_timestamp

The time, in UTC, at which this object was opened.

tiledb_timestamp_ms

The time this object was opened, in milliseconds since the Unix epoch.

metadata

The metadata of this SOMA object.
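The nnz attribute counts stored values, including explicitly stored zeros, which is not the same as counting nonzero values. A minimal plain-Python sketch of that distinction follows; the dictionary is a hypothetical stand-in for the array's stored cells and is not how tiledbsoma holds data.

```python
# Toy model only: a mapping from coordinate to stored value.
# The cell at (1, 0) was written with the value 0.0, so it is an
# explicitly stored zero rather than an implicit one.
stored = {
    (0, 2): 1.5,
    (1, 0): 0.0,
    (3, 1): 7.25,
}

nnz = len(stored)  # every stored cell counts, zero-valued or not
nonzero_values = sum(1 for v in stored.values() if v != 0)

print(nnz, nonzero_values)  # 3 2
```

In this model a cell that was never written contributes to neither count, even though it reads back as zero.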