tiledbsoma.io.from_h5ad

tiledbsoma.io.from_h5ad(experiment_uri: str, input_path: str | pathlib.Path, measurement_name: str, *, context: SOMATileDBContext | None = None, platform_config: Dict[str, Mapping[str, Any]] | object | None = None, obs_id_name: str = 'obs_id', var_id_name: str = 'var_id', X_layer_name: str = 'data', raw_X_layer_name: str = 'data', ingest_mode: Literal['write', 'schema_only', 'resume'] = 'write', use_relative_uri: bool | None = None, X_kind: Type[SparseNDArray] | Type[DenseNDArray] = SparseNDArray, registration_mapping: ExperimentAmbientLabelMapping | None = None, uns_keys: Sequence[str] | None = None, additional_metadata: Dict[str, bytes | float | int | str] | None = None) -> str

Reads an .h5ad file and writes it to an Experiment.

Measurement data is stored in a Measurement in the experiment’s ms field, with the key provided by measurement_name. Data elements are available at the standard fields (var, X, etc.). Unstructured data from uns is partially supported (structured arrays and non-numeric NDArrays are skipped), and is available at the measurement’s uns key (i.e., at your_experiment.ms[measurement_name]["uns"]).
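
As a minimal sketch of the layout described above, the following ingests one file and then reports what landed under the measurement. The helper names ingest_one and summarize are hypothetical and not part of tiledbsoma; the sketch assumes tiledbsoma is installed and that the input is a local file.

```python
from pathlib import Path


def ingest_one(experiment_uri: str, h5ad_path: str, measurement_name: str = "RNA") -> str:
    """Ingest a single local H5AD file and return the experiment URI."""
    # Assumption of this sketch: the input is a local file, so fail early if it is missing.
    if not Path(h5ad_path).is_file():
        raise FileNotFoundError(f"no such H5AD file: {h5ad_path}")

    # Imported here so the helper can be defined even where tiledbsoma is absent.
    import tiledbsoma.io

    return tiledbsoma.io.from_h5ad(
        experiment_uri,
        h5ad_path,
        measurement_name=measurement_name,
    )


def summarize(experiment_uri: str, measurement_name: str = "RNA") -> dict:
    """Return the keys stored under the measurement and under its X collection."""
    import tiledbsoma

    with tiledbsoma.Experiment.open(experiment_uri) as exp:
        ms = exp.ms[measurement_name]
        return {
            "measurement_keys": sorted(ms.keys()),
            "X_layers": sorted(ms.X.keys()),
            "has_uns": "uns" in ms,
        }
```

With the default X_layer_name, the X_layers entry would be expected to contain "data"; whether "uns" is present depends on the input file and on uns_keys.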

Parameters:
  • experiment_uri – The experiment to create or update.

  • input_path – A path to an input H5AD file.

  • measurement_name – The name of the measurement to store data in.

  • context – Optional SOMATileDBContext containing storage parameters, etc.

  • platform_config – Platform-specific options used to create this array, provided in the form {"tiledb": {"create": {"sparse_nd_array_dim_zstd_level": 7}}}.

  • obs_id_name/var_id_name

    Which AnnData obs and var columns, respectively, to use for append mode.

    Values of this column will be used to decide which obs/var rows in appended inputs are distinct from the ones already stored, for the assignment of soma_joinid. If this column exists in the input data, as a named index or a non-index column name, it will be used. If this column doesn’t exist in the input data, and if the index is nameless or named "index", that index will be given this name when written to the SOMA experiment’s obs / var.

    NOTE: it is not necessary for this column to be the index-column name in the input AnnData object’s obs/var.

  • X_layer_name – SOMA array name for the AnnData’s X matrix.

  • raw_X_layer_name – SOMA array name for the AnnData’s raw/X matrix.

  • ingest_mode

    The ingestion type to perform:

    • write: Writes all data, creating new layers if the SOMA already exists.

    • resume: Adds data to an existing SOMA, skipping writing data that was previously written. Useful for continuing after a partial or interrupted ingestion operation.

    • schema_only: Creates groups and the array schema, without writing any data to the array. Useful to prepare for appending multiple H5AD files to a single SOMA.

  • X_kind – Which type of matrix is used to store dense X data from the H5AD file: DenseNDArray or SparseNDArray.

  • registration_mapping

    Does not need to be supplied when ingesting a single H5AD/AnnData object into a single Experiment. When multiple inputs are to be ingested into a single experiment, there are two steps. First:

    import tiledbsoma.io
    rd = tiledbsoma.io.register_h5ads(
        experiment_uri,
        h5ad_file_names,
        measurement_name="RNA",
        obs_field_name="obs_id",
        var_field_name="var_id",
        context=context,
    )
    

    Once that’s been done, the ingests themselves may be run in any order, or in parallel, by making the following call for each h5ad_file_name:

    tiledbsoma.io.from_h5ad(
        experiment_uri,
        h5ad_file_name,
        measurement_name="RNA",
        ingest_mode="write",
        registration_mapping=rd,
    )
    

  • uns_keys – Only ingest the specified top-level uns keys. The default is to ingest them all. Use uns_keys=[] to not ingest any uns keys.

  • additional_metadata

    Optional metadata to add to the Experiment and all descendants. This is a coarse-grained mechanism for setting key-value pairs on all SOMA objects in an Experiment hierarchy. Metadata for particular objects is more commonly set like:

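    import tiledbsoma as soma

    # Open the experiment for writing and set metadata on individual objects.
    with soma.open(uri, 'w') as exp:
        exp.metadata.update({"aaa": "BBB"})
        exp.obs.metadata.update({"ccc": 123})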
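
Several of the keyword arguments above are plain Python values, so they can be assembled ahead of the call. The sketch below shows one way to do that. The names build_ingest_kwargs, ingest_with_options, and VALID_INGEST_MODES are hypothetical helpers introduced for illustration, not part of tiledbsoma, and the measurement name "RNA" is a placeholder.

```python
from typing import Any, Dict, Optional, Sequence

# The three ingest modes listed in the signature of from_h5ad.
VALID_INGEST_MODES = ("write", "schema_only", "resume")


def build_ingest_kwargs(
    ingest_mode: str = "write",
    zstd_level: int = 7,
    uns_keys: Optional[Sequence[str]] = None,
    additional_metadata: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for from_h5ad (hypothetical helper)."""
    if ingest_mode not in VALID_INGEST_MODES:
        raise ValueError(f"ingest_mode must be one of {VALID_INGEST_MODES}")

    kwargs: Dict[str, Any] = {
        "ingest_mode": ingest_mode,
        # Same shape as the platform_config example in the parameter list.
        "platform_config": {
            "tiledb": {"create": {"sparse_nd_array_dim_zstd_level": zstd_level}}
        },
    }
    # None means "ingest all uns keys"; an empty list means "ingest none".
    if uns_keys is not None:
        kwargs["uns_keys"] = list(uns_keys)
    if additional_metadata is not None:
        kwargs["additional_metadata"] = dict(additional_metadata)
    return kwargs


def ingest_with_options(experiment_uri: str, h5ad_path: str, **options: Any) -> str:
    """Call from_h5ad with the assembled options (requires tiledbsoma)."""
    import tiledbsoma.io

    return tiledbsoma.io.from_h5ad(
        experiment_uri,
        h5ad_path,
        measurement_name="RNA",
        **build_ingest_kwargs(**options),
    )
```

Keeping the option assembly separate from the call makes it easy to check, for example, that uns_keys=[] is passed through as an empty list rather than being dropped.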
    

Returns:

The URI of the experiment that was created or updated.
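
The two-step flow described under registration_mapping, together with the return value, can be sketched as a single helper. This is an illustration under stated assumptions, not the library's own recipe: ingest_all and its register/ingest parameters are hypothetical, and they default to tiledbsoma.io.register_h5ads and tiledbsoma.io.from_h5ad. Only the two steps described above are covered; any further preparation a given tiledbsoma version may require is out of scope here.

```python
from typing import Callable, List, Optional, Sequence


def ingest_all(
    experiment_uri: str,
    h5ad_file_names: Sequence[str],
    measurement_name: str = "RNA",
    register: Optional[Callable] = None,
    ingest: Optional[Callable] = None,
) -> List[str]:
    """Register every input once, then ingest each file with the shared mapping."""
    if register is None or ingest is None:
        # Default to the real tiledbsoma functions; imported lazily so the
        # helper can be exercised with stand-ins where tiledbsoma is absent.
        import tiledbsoma.io

        register = register or tiledbsoma.io.register_h5ads
        ingest = ingest or tiledbsoma.io.from_h5ad

    # Step 1: a single registration pass over all inputs.
    rd = register(
        experiment_uri,
        h5ad_file_names,
        measurement_name=measurement_name,
        obs_field_name="obs_id",
        var_field_name="var_id",
    )

    # Step 2: per-file ingestion; each call returns the experiment URI.
    return [
        ingest(
            experiment_uri,
            name,
            measurement_name=measurement_name,
            ingest_mode="write",
            registration_mapping=rd,
        )
        for name in h5ad_file_names
    ]
```

The loop here is sequential for simplicity; since the ingests may run in any order once registration is done, the same calls could be dispatched to parallel workers instead.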

Lifecycle

Maturing.