bw_processing.io_pyarrow_helpers
This module contains helpers to convert numpy.ndarray objects to and from Apache Arrow Table objects.
We use pyarrow.Table objects to save data to, and retrieve data from, parquet files. A metadata section in the pyarrow.Table (and in the parquet file) records what type of data was serialized. Both specific and generic data types are supported.
The metadata object is a dict that looks like this:
{"object": "vector", "type": "generic"}
object can be vector (ndim == 1) or matrix (ndim == 2); type can be:
indices (dtype is INDICES_DTYPE);
distributions (dtype is UNCERTAINTY_DTYPE);
generic (dtype is a common type).
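As an illustration of the metadata layout described above, here is a minimal sketch that builds the dict from an array's number of dimensions and a type label. The helper name `build_metadata` and the two lookup constants are hypothetical and are not part of bw_processing; the sketch only mirrors the rules stated in this section.

```python
# Hypothetical helper, not part of bw_processing: it only mirrors the
# metadata rules described in the text above.
OBJECT_BY_NDIM = {1: "vector", 2: "matrix"}
VALID_TYPES = {"indices", "distributions", "generic"}


def build_metadata(ndim: int, kind: str) -> dict:
    """Return a dict such as {"object": "vector", "type": "generic"}."""
    if ndim not in OBJECT_BY_NDIM:
        raise ValueError(f"unsupported ndim: {ndim}")
    if kind not in VALID_TYPES:
        raise ValueError(f"unsupported type: {kind}")
    return {"object": OBJECT_BY_NDIM[ndim], "type": kind}


vector_meta = build_metadata(1, "generic")
matrix_meta = build_metadata(2, "generic")
```

Rejecting unknown values early keeps a mislabeled array from being written with metadata that a reader could not interpret later.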
Functions
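numpy_distributions_vector_to_pyarrow_distributions_vector_table(arr) | Convert a specific distributions (numpy) vector to a (arrow) table. |
numpy_generic_matrix_to_pyarrow_generic_matrix_table(arr) | Convert a generic (numpy) matrix to a (arrow) table. |
numpy_generic_vector_to_pyarrow_generic_vector_table(arr) | Convert a generic (numpy) vector to a (arrow) table. |
numpy_indices_vector_to_pyarrow_indices_vector_table(arr) | Convert a specific indices (numpy) vector to a (arrow) table. |
pyarrow_distributions_vector_table_to_numpy_distributions_vector(table) | Convert a specific distributions (arrow) vector table to a (numpy) array. |
pyarrow_generic_matrix_table_to_numpy_generic_matrix(table) | Convert a generic (arrow) matrix table to a (numpy) array. |
pyarrow_generic_vector_table_to_numpy_generic_vector(table) | Convert a generic (arrow) vector table to a (numpy) array. |
pyarrow_indices_vector_table_to_numpy_indices_vector(table) | Convert a specific indices (arrow) vector table to a (numpy) array. |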
Module Contents
- bw_processing.io_pyarrow_helpers.numpy_distributions_vector_to_pyarrow_distributions_vector_table(arr: numpy.ndarray) → pyarrow.Table
Convert a specific distributions (numpy) vector to a (arrow) table.
- Parameters:
arr (np.ndarray) – A numpy array that corresponds to a distributions vector, i.e. its dimension is 1 and its dtype is UNCERTAINTY_DTYPE.
- See:
pyarrow_distributions_vector_table_to_numpy_distributions_vector
- Returns:
The corresponding pyarrow.Table object.
- bw_processing.io_pyarrow_helpers.numpy_generic_matrix_to_pyarrow_generic_matrix_table(arr: numpy.ndarray) → pyarrow.Table
Convert a generic (numpy) matrix to a (arrow) table.
- Parameters:
arr (ndarray) – A numpy array that corresponds to a generic matrix, i.e. its dimension is 2.
- See:
pyarrow_generic_matrix_table_to_numpy_generic_matrix.
- Returns:
The corresponding pyarrow.Table object.
- bw_processing.io_pyarrow_helpers.numpy_generic_vector_to_pyarrow_generic_vector_table(arr: numpy.ndarray) → pyarrow.Table
Convert a generic (numpy) vector to a (arrow) table.
- Parameters:
arr (ndarray) – A numpy array that corresponds to a vector, i.e. its dimension is 1.
- See:
pyarrow_generic_vector_table_to_numpy_generic_vector.
- Returns:
The corresponding pyarrow.Table object.
- bw_processing.io_pyarrow_helpers.numpy_indices_vector_to_pyarrow_indices_vector_table(arr: numpy.ndarray) → pyarrow.Table
Convert a specific indices (numpy) vector to a (arrow) table.
- Parameters:
arr (ndarray) – A numpy array that corresponds to an indices vector, i.e. its dimension is 1 and its dtype is INDICES_DTYPE.
- See:
pyarrow_indices_vector_table_to_numpy_indices_vector.
- Returns:
The corresponding pyarrow.Table object.
- bw_processing.io_pyarrow_helpers.pyarrow_distributions_vector_table_to_numpy_distributions_vector(table: pyarrow.Table) → numpy.ndarray
Convert a specific distributions (arrow) vector table to a (numpy) array.
- Parameters:
table (pa.Table) – A pyarrow table that corresponds to a distributions vector.
- See:
numpy_distributions_vector_to_pyarrow_distributions_vector_table.
- Returns:
The corresponding np.ndarray object.
- bw_processing.io_pyarrow_helpers.pyarrow_generic_matrix_table_to_numpy_generic_matrix(table: pyarrow.Table) → numpy.ndarray
Convert a generic (arrow) matrix table to a (numpy) array.
- Parameters:
table (pa.Table) – A pyarrow table that corresponds to a generic matrix.
- See:
numpy_generic_matrix_to_pyarrow_generic_matrix_table.
- Returns:
The corresponding np.ndarray object.
- bw_processing.io_pyarrow_helpers.pyarrow_generic_vector_table_to_numpy_generic_vector(table: pyarrow.Table) → numpy.ndarray
Convert a generic (arrow) vector table to a (numpy) array.
- Parameters:
table (pa.Table) – A pyarrow table that corresponds to a vector.
- See:
numpy_generic_vector_to_pyarrow_generic_vector_table.
- Returns:
The corresponding np.ndarray object.
- bw_processing.io_pyarrow_helpers.pyarrow_indices_vector_table_to_numpy_indices_vector(table: pyarrow.Table) → numpy.ndarray
Convert a specific indices (arrow) vector table to a (numpy) array.
- Parameters:
table (pa.Table) – A pyarrow table that corresponds to an indices vector.
- See:
numpy_indices_vector_to_pyarrow_indices_vector_table.
- Returns:
The corresponding np.ndarray object.