bw_processing.array_creation
============================

.. py:module:: bw_processing.array_creation

Functions
---------

.. autoapisummary::

   bw_processing.array_creation.chunked
   bw_processing.array_creation.create_array
   bw_processing.array_creation.create_chunked_array
   bw_processing.array_creation.create_chunked_structured_array
   bw_processing.array_creation.create_structured_array
   bw_processing.array_creation.get_ncols
   bw_processing.array_creation.peek

Module Contents
---------------

.. py:function:: chunked(iterable, chunk_size)

.. py:function:: create_array(iterable, nrows=None, dtype=np.float32)

   Create a numpy array from ``iterable``. Returns the filepath of the created file (if ``filepath`` is provided), or the array.

   ``iterable`` can be data already in memory, or a generator.

   ``nrows`` can be supplied, if known. If ``iterable`` has a length, ``nrows`` is determined automatically. If ``nrows`` is not known, this function generates chunked arrays until ``iterable`` is exhausted, and concatenates them.

   Either ``nrows`` or ``ncols`` must be specified.

.. py:function:: create_chunked_array(iterable, ncols, dtype=np.float32, bucket_size=500)

   Create a numpy array from an iterable of indeterminate length.

   Needed when the length of the iterable can't be determined ahead of time (e.g. for a generator or a database cursor), so the complete array can't be created in memory in one step.

   Creates a list of arrays with ``bucket_size`` rows until ``iterable`` is exhausted, then concatenates them.

   :param iterable: Iterable of data used to populate the array.
   :param ncols: Number of columns in the created array.
   :param dtype: Numpy dtype of the created array.
   :param bucket_size: Number of rows in each intermediate array.

   :returns: The created array. A zero-length array is returned if ``iterable`` has no data.

.. py:function:: create_chunked_structured_array(iterable, dtype, bucket_size=20000)

   Create a numpy structured array from an iterable of indeterminate length.
   Needed when the length of the iterable can't be determined ahead of time (e.g. for a generator or a database cursor), so the complete array can't be created in memory in one step.

   Creates a list of arrays with ``bucket_size`` rows until ``iterable`` is exhausted, then concatenates them.

   :param iterable: Iterable of data used to populate the array.
   :param dtype: Numpy dtype of the created array.
   :param format_function: If provided, this function will be called on each row of ``iterable`` before insertion in the array.
   :param bucket_size: Number of rows in each intermediate array.

   :returns: The created array. A zero-length array is returned if ``iterable`` has no data.

.. py:function:: create_structured_array(iterable, dtype, nrows=None, sort=False, sort_fields=None)

   Create a numpy structured array from ``iterable``. Returns the filepath of the created file (if ``filepath`` is provided), or the array.

   ``iterable`` can be data already in memory, or a generator.

   ``nrows`` can be supplied, if known. If ``iterable`` has a length, ``nrows`` is determined automatically. If ``nrows`` is not known, this function generates chunked arrays until ``iterable`` is exhausted, and concatenates them.

.. py:function:: get_ncols(iterator)

.. py:function:: peek(iterator)
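The chunked strategy described for ``create_chunked_array`` and ``create_chunked_structured_array`` (fill intermediate arrays of ``bucket_size`` rows, then concatenate) can be illustrated with a minimal sketch. This is not the library's implementation: the names ``chunked_sketch``, ``create_chunked_array_sketch``, and ``create_chunked_structured_array_sketch`` are hypothetical, and the sample data and field names are assumptions made for illustration. Only ``numpy`` and the standard library are used.

```python
import itertools

import numpy as np


def chunked_sketch(iterable, chunk_size):
    """Yield lists of at most ``chunk_size`` items (hypothetical helper)."""
    iterator = iter(iterable)
    while True:
        chunk = list(itertools.islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def create_chunked_array_sketch(iterable, ncols, dtype=np.float32, bucket_size=500):
    """Build one 2-D array per bucket, then concatenate the buckets."""
    buckets = []
    for chunk in chunked_sketch(iterable, bucket_size):
        bucket = np.zeros((len(chunk), ncols), dtype=dtype)
        for i, row in enumerate(chunk):
            bucket[i, :] = row
        buckets.append(bucket)
    if not buckets:
        # No data: return a zero-length array with the requested shape.
        return np.zeros((0, ncols), dtype=dtype)
    return np.concatenate(buckets)


def create_chunked_structured_array_sketch(iterable, dtype, bucket_size=20000):
    """Same idea for structured arrays; each row must be a tuple."""
    buckets = [
        np.array(chunk, dtype=dtype)
        for chunk in chunked_sketch(iterable, bucket_size)
    ]
    if not buckets:
        return np.zeros(0, dtype=dtype)
    return np.concatenate(buckets)


# A generator has no length, so the final row count is unknown up front.
rows = ((i, i * 2.0) for i in range(1200))
result = create_chunked_array_sketch(rows, ncols=2)

# Assumed field names, chosen only for this example.
record_dtype = np.dtype([("row", np.int32), ("amount", np.float32)])
records = ((i, float(i)) for i in range(5))
structured = create_chunked_structured_array_sketch(
    records, record_dtype, bucket_size=2
)

empty = create_chunked_array_sketch(iter([]), ncols=2)
```

Keeping a list of fixed-size buckets bounds the size of each allocation to ``bucket_size`` rows, at the cost of one extra copy when the buckets are concatenated at the end.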