tiledbsoma.DataFrame.read
- DataFrame.read(coords: Sequence[None | bytes | Slice[bytes] | Sequence[bytes] | float | Slice[float] | Sequence[float] | int | Slice[int] | Sequence[int] | slice | Slice[slice] | Sequence[slice] | str | Slice[str] | Sequence[str] | datetime64 | Slice[datetime64] | Sequence[datetime64] | TimestampType | Slice[TimestampType] | Sequence[TimestampType] | Array | ChunkedArray | ndarray[Any, dtype[integer]] | ndarray[Any, dtype[datetime64]]] = (), column_names: Sequence[str] | None = None, *, result_order: ResultOrder | Literal['auto', 'row-major', 'column-major'] = ResultOrder.AUTO, value_filter: str | None = None, batch_size: BatchSize = BatchSize(count=None, bytes=None), partitions: ReadPartitions | None = None, platform_config: Dict[str, Mapping[str, Any]] | object | None = None) -> TableReadIter
Reads a user-defined subset of data, addressed by the dataframe's index columns and optionally filtered, and returns the results as one or more Arrow tables.
- Parameters:
coords – For each index dimension, which rows to read. Defaults to (), meaning no constraint: all IDs.
column_names – The named columns to read and return. Defaults to None, meaning no constraint: all column names.
result_order – Order of read results. This can be one of 'row-major', 'column-major', or 'auto'.
value_filter – An optional value filter to apply to the results. Defaults to no filter.
partitions – An optional ReadPartitions hint to indicate how results should be organized.
- Returns:
A TableReadIter that can be used to iterate through the result set.
- Raises:
SOMAError – If value_filter cannot be parsed.
ValueError – If coords are malformed or do not index this DataFrame.
SOMAError – If the object is not open for reading.
Notes
The coords parameter supports, per dimension, a list of values of the type of the indexed column.
Acceptable ways to index:
- A sequence of coordinates is accepted, one per dimension.
- Sequence length must be <= number of dimensions.
- If the sequence contains missing coordinates (length less than number of dimensions), then slice(None) (i.e. no constraint) is assumed for the missing dimensions.
- Per dimension, explicitly specified coordinates can be one of: None, a value, a list/numpy.ndarray/pyarrow.Array/etc. of values, a slice, etc.
- Slices are doubly inclusive: slice(2,4) means [2,3,4], not [2,3]. Slice steps are not supported. Slices can be slice(None), meaning select all in that dimension, and may be half-specified, e.g. slice(2,None) or slice(None,4).
- Negative indexing is unsupported.
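The indexing rules above can be modelled in plain Python. The sketch below is illustrative only: pad_coords and expand_inclusive_slice are hypothetical helpers, not part of tiledbsoma, and the use of ValueError here is an assumption rather than a statement about which exception the library raises in each case.

```python
from typing import Any, List, Sequence, Tuple


def pad_coords(coords: Sequence[Any], ndim: int) -> Tuple[Any, ...]:
    """Pad a coords sequence to exactly one entry per index dimension.

    Missing trailing dimensions get slice(None), i.e. no constraint.
    """
    if len(coords) > ndim:
        raise ValueError(
            f"coords has {len(coords)} entries but only {ndim} index dimensions"
        )
    return tuple(coords) + (slice(None),) * (ndim - len(coords))


def expand_inclusive_slice(s: slice, lo: int, hi: int) -> List[int]:
    """List the integer IDs a doubly inclusive slice selects within [lo, hi]."""
    if s.step is not None:
        raise ValueError("slice steps are not supported")
    # A half-specified slice falls back to the dimension bound on the open side.
    start = lo if s.start is None else s.start
    stop = hi if s.stop is None else s.stop
    if start < 0 or stop < 0:
        raise ValueError("negative indexing is unsupported")
    # Both endpoints are included, unlike Python's usual half-open slices.
    return list(range(max(start, lo), min(stop, hi) + 1))


padded = pad_coords((slice(2, 4),), ndim=2)
selected = expand_inclusive_slice(slice(2, 4), lo=0, hi=9)
```

Here padded is (slice(2, 4), slice(None)) and selected is [2, 3, 4], which matches the doubly inclusive reading of slice(2,4) described in the notes.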
Lifecycle
Maturing.