bw_processing.io_parquet_helpers
This module contains helpers to serialize numpy.ndarray objects to, and deserialize them from, Apache Parquet files. To do so, we convert the numpy.ndarray objects to pyarrow.Table objects.
Functions
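load_ndarray_from_parquet | Deserialize a numpy ndarray from a parquet file. |
read_parquet_file_to_ndarray | Read an ndarray from a parquet file. |
save_arr_to_parquet | Serialize a numpy ndarray to a parquet file. |
write_ndarray_to_parquet_file | Serialize ndarray objects to file. |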
Module Contents
- bw_processing.io_parquet_helpers.load_ndarray_from_parquet(file: io.RawIOBase) → numpy.ndarray
Deserialize a numpy ndarray from a parquet file.
- Parameters
file (io.RawIOBase or fsspec file object): File to read from.
- Returns
The corresponding numpy ndarray.
- bw_processing.io_parquet_helpers.read_parquet_file_to_ndarray(file: io.RawIOBase) → numpy.ndarray
Read an ndarray from a parquet file.
- Parameters:
file (io.RawIOBase or fsspec file object) – File to read from.
- Returns:
The corresponding numpy ndarray.
- bw_processing.io_parquet_helpers.save_arr_to_parquet(file: io.RawIOBase, arr: numpy.ndarray, meta_object: str, meta_type: str) → None
Serialize a numpy ndarray to a parquet file.
- Parameters
file (RawIOBase): The file to save to.
arr (ndarray): The array object to save.
meta_object (str): “vector” or “matrix”.
meta_type (str): Type of object to serialize (see io_pyarrow_helpers.py).
- bw_processing.io_parquet_helpers.write_ndarray_to_parquet_file(file: io.BufferedWriter, arr: numpy.ndarray, meta_object: str, meta_type: str)
Serialize ndarray objects to file.
- Parameters
file (io.BufferedWriter): File to save to.
arr (ndarray): Array to serialize.
meta_object (str): “vector” or “matrix”.
meta_type (str): Type of object to serialize (see io_pyarrow_helpers.py).