bw2io.strategies.generic

Functions

`add_database_name`(→ List[dict])	Adds a database name to each dataset in a list of datasets.
`assign_only_product_as_production`(→ List[dict])	Assign only product as reference product.
`convert_activity_parameters_to_list`(→ List[dict])	Convert activity parameters from a dictionary to a list of dictionaries.
`convert_uncertainty_types_to_integers`(→ List[dict])	Convert uncertainty types in a list of datasets to integers.
`drop_falsey_uncertainty_fields_but_keep_zeros`(→ List[dict])	Drop uncertainty fields that are falsey (e.g. '', None) but keep zero and NaN.
`drop_unlinked`(→ List[dict])	Remove all exchanges in a given database that don't have inputs.
`format_nonunique_key_error`(→ str)	Generate a formatted error message for a dataset that can't be uniquely linked to the target database.
`link_iterable_by_fields`(→ List[dict])	Link objects in `unlinked` to objects in `other` using fields `fields`.
`link_technosphere_by_activity_hash`(db[, ...])	Link technosphere exchanges using the activity_hash function.
`match_against_only_available_in_given_context_tree`(...)	For unlinked edges with a categories context ('a', 'b', ...), try to match against flows in
`match_against_top_level_context`(→ List[dict])	For unlinked edges with a categories context ('a', 'b', ...), try to match against flows in
`normalize_units`(→ List[dict])	Normalize units in datasets and their exchanges.
`set_code_by_activity_hash`(→ List[dict])	Set the dataset code for each dataset in the given database using activity_hash.
`split_exchanges`(→ List[dict])	Split unlinked exchanges in `data` which satisfy `filter_params` into new exchanges with changed attributes.
`tupleize_categories`(→ List[dict])	Convert the "categories" fields in a given database and its exchanges to tuples.

Module Contents

bw2io.strategies.generic.add_database_name(db: List[dict], name: str) → List[dict]

Adds a database name to each dataset in a list of datasets.

Parameters:

db (list[dict]) – The list of datasets to add the database name to.
name (str) – The name of the database to be added to each dataset.

Returns:

The updated list of datasets with the database name added to each dataset.

Return type:

list[dict]

Examples

>>> db = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}]
>>> add_database_name(db, "X")
[{'id': 1, 'name': 'A', 'database': 'X'}, {'id': 2, 'name': 'B', 'database': 'X'}]

An empty list input returns an empty list.

>>> add_database_name([], "Y")
[]

bw2io.strategies.generic.assign_only_product_as_production(db: Iterable[dict]) → List[dict]

Assign only product as reference product.

For each dataset in db, this function checks if there is only one production exchange and no reference product already assigned. If this is the case, the reference product is set to the name of the production exchange, and the following fields are replaced if not already specified:

‘name’ - name of reference product
‘unit’ - unit of reference product
‘production amount’ - amount of reference product

Parameters: db (iterable) – An iterable of dictionaries containing the datasets to process.
Returns: An iterable of dictionaries containing the processed datasets.
Return type: iterable
Raises: AssertionError – If a production exchange does not have a name attribute.

Examples

>>> data = [{'name': 'Input 1', 'exchanges': [{'type': 'production', 'name': 'Product 1', 'amount': 1}, {'type': 'technosphere', 'name': 'Input 2', 'amount': 2}]}, {'name': 'Input 2', 'exchanges': [{'type': 'production', 'name': 'Product 2', 'amount': 3}, {'type': 'technosphere', 'name': 'Input 3', 'amount': 4}]}]
>>> processed_data = assign_only_product_as_production(data)
>>> processed_data[0]['reference product']
'Product 1'
>>> processed_data[0]['name']
'Input 1'
>>> processed_data[1]['reference product']
'Product 2'
>>> processed_data[1]['unit']
'Unknown'

bw2io.strategies.generic.convert_activity_parameters_to_list(data: List[dict]) → List[dict]

Convert activity parameters from a dictionary to a list of dictionaries.

Parameters: data (list[dict]) – The list of activities to convert parameters from.
Returns: The updated list of activities with parameters converted to a list of dictionaries.
Return type: list[dict]

Examples

>>> data = [{"name": "A", "parameters": {"param1": {"amount": 1}, "param2": {"amount": 2}}}, {"name": "B", "parameters": {"param3": {"amount": 3}, "param4": {"amount": 4}}}]
>>> convert_activity_parameters_to_list(data)
[{'name': 'A', 'parameters': [{'amount': 1, 'name': 'param1'}, {'amount': 2, 'name': 'param2'}]}, {'name': 'B', 'parameters': [{'amount': 3, 'name': 'param3'}, {'amount': 4, 'name': 'param4'}]}]

Activities without parameters remain unchanged.

>>> data = [{"name": "C"}]
>>> convert_activity_parameters_to_list(data)
[{'name': 'C'}]

bw2io.strategies.generic.convert_uncertainty_types_to_integers(db: List[dict]) → List[dict]

Convert uncertainty types in a list of datasets to integers.

Parameters: db (list[dict]) – The list of datasets containing uncertainty types to convert.
Returns: The updated list of datasets with uncertainty types converted to integers where possible.
Return type: list[dict]

Examples

>>> db = [{"name": "A", "exchanges": [{"uncertainty type": "triangular"}]}, {"name": "B", "exchanges": [{"uncertainty type": "lognormal"}]}]
>>> convert_uncertainty_types_to_integers(db)
[{'name': 'A', 'exchanges': [{'uncertainty type': 'triangular'}]}, {'name': 'B', 'exchanges': [{'uncertainty type': 'lognormal'}]}]

Float values are rounded down to integers.

>>> db = [{"name": "C", "exchanges": [{"uncertainty type": "1"}, {"uncertainty type": "2.0"}]}]
>>> convert_uncertainty_types_to_integers(db)
[{'name': 'C', 'exchanges': [{'uncertainty type': 1}, {'uncertainty type': 2}]}]

bw2io.strategies.generic.drop_falsey_uncertainty_fields_but_keep_zeros(db: List[dict]) → List[dict]

Drop uncertainty fields that are falsey (e.g. ‘’, None) but keep zero and NaN.

Note that this function doesn’t strip False, because False compares equal to 0 in Python.

Parameters: db (list[dict]) – The list of datasets to drop uncertainty fields from.
Returns: The updated list of datasets with falsey uncertainty fields dropped.
Return type: list[dict]

Examples

>>> db = [{"name": "A", "exchanges": [{"amount": 1, "minimum": 0, "maximum": None, "shape": ""}]}]
>>> drop_falsey_uncertainty_fields_but_keep_zeros(db)
[{'name': 'A', 'exchanges': [{'amount': 1, 'minimum': 0}]}]

Float values of NaN are kept in the dictionary.

>>> db = [{"name": "B", "exchanges": [{"loc": 0.0, "scale": 0.5, "minimum": float('nan')},
...                                   {"loc": 0.0, "scale": 0.5}]}]
>>> drop_falsey_uncertainty_fields_but_keep_zeros(db)
[{'name': 'B', 'exchanges': [{'loc': 0.0, 'scale': 0.5, 'minimum': nan}, {'loc': 0.0, 'scale': 0.5}]}]
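The keep-or-drop rule described above can be sketched in plain Python. This is a simplified illustration of the documented behaviour, not the library's implementation; the helper names `keep_uncertainty_value` and `drop_falsey_fields` and the field list `UNCERTAINTY_FIELDS` are assumptions made for the example.

```python
# Hypothetical list of uncertainty fields, assumed for this sketch.
UNCERTAINTY_FIELDS = ("minimum", "maximum", "scale", "shape", "loc")


def keep_uncertainty_value(value):
    """Return True if a value should survive the filter in this sketch."""
    # Zero is kept explicitly; False is kept too because False == 0 in Python.
    if value == 0:
        return True
    # Anything else is kept only if it is truthy; NaN is truthy, so it survives.
    return bool(value)


def drop_falsey_fields(exchange):
    """Return a copy of an exchange without falsey uncertainty fields."""
    return {
        key: value
        for key, value in exchange.items()
        if key not in UNCERTAINTY_FIELDS or keep_uncertainty_value(value)
    }


cleaned = drop_falsey_fields(
    {"amount": 1, "minimum": 0, "maximum": None, "shape": "", "loc": float("nan")}
)
# "maximum" and "shape" are dropped; "minimum" (zero) and "loc" (NaN) remain.
```

Fields outside the assumed uncertainty list, such as "amount", are never touched by this sketch.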

bw2io.strategies.generic.drop_unlinked(db: List[dict]) → List[dict]

Remove all exchanges in a given database that don’t have inputs.

Exchanges that don’t have any inputs are often referred to as “unlinked exchanges”. These exchanges can be a sign of an incomplete or poorly structured database.

Parameters: db (list[dict]) – The list of datasets to remove unlinked exchanges from.
Returns: The list of datasets with unlinked exchanges removed.
Return type: list[dict]

Notes

This is the nuclear option - use at your own risk! ⚠️

Examples

>>> db = [
...    {"name": "Product A", "unit": "kg", "exchanges": [{"input": True, "amount": 1, "name": "Input 1", "unit": "kg"}]},
...    {"name": "Product B", "unit": "kg", "exchanges": [{"input": True, "amount": 1, "name": "Input 2", "unit": "kg"}, {"input": False, "amount": 0.5, "name": "Product A", "unit": "kg"}]},
...    {"name": "Product C", "unit": "kg", "exchanges": [{"input": False, "amount": 0.75, "name": "Product A", "unit": "kg"}]}
... ]
>>> drop_unlinked(db)
[
    {'name': 'Product A', 'unit': 'kg', 'exchanges': [{'input': True, 'amount': 1, 'name': 'Input 1', 'unit': 'kg'}]},
    {'name': 'Product B', 'unit': 'kg', 'exchanges': [{'input': True, 'amount': 1, 'name': 'Input 2', 'unit': 'kg'}]},
    {'name': 'Product C', 'unit': 'kg', 'exchanges': []}
]
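The filtering step can be sketched as a list comprehension over each dataset's exchanges. This is a minimal illustration that assumes an exchange counts as linked when its "input" value is truthy; the function name `drop_unlinked_sketch` and the sample data are hypothetical and not taken from the library.

```python
def drop_unlinked_sketch(datasets):
    """Keep only exchanges with a truthy "input" value (simplified sketch)."""
    for ds in datasets:
        # Datasets are modified in place; exchanges without an input are discarded.
        ds["exchanges"] = [exc for exc in ds.get("exchanges", []) if exc.get("input")]
    return datasets


datasets = [
    {
        "name": "Product B",
        "exchanges": [
            {"input": ("db", "code-1"), "amount": 1},  # linked
            {"amount": 0.5},  # no "input" key: unlinked
        ],
    },
]
result = drop_unlinked_sketch(datasets)
```

Because the dropped exchanges are deleted rather than repaired, the amounts they carried are lost, which is why the note above calls this the nuclear option.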

bw2io.strategies.generic.format_nonunique_key_error(obj: dict, fields: List[str], others: List[dict]) → str

Generate a formatted error message for a dataset that can’t be uniquely linked to the target database.

Parameters:

obj (dict) – The problematic dataset that can’t be uniquely linked to the target database.
fields (list) – The list of fields to include in the error message.
others (list) – A list of other similar datasets.

Returns:

A formatted error message.

Return type:

str

See also

activity_hash: Generate a unique hash key for a dataset.
format_nonunique_key_error: Generate an error message for datasets that can’t be uniquely linked.

Notes

link_iterable_by_fields takes two iterables of dictionaries, unlinked and other, where each dictionary represents an object to be linked. Objects are linked by matching on the fields named in fields. The function returns an iterable of dictionaries containing the linked objects.

If the parameter kind is specified, only objects of the given kind are linked. If internal is True, objects in unlinked are linked to other objects in unlinked. If relink is True, objects that already have an input are linked again.

If a link is not unique, a StrategyError is raised, which includes a formatted error message generated by the format_nonunique_key_error function.

Examples

>>> data = [
...     {"exchanges": [
...         {"type": "A", "value": 1},
...         {"type": "B", "value": 2}
...     ]},
...     {"exchanges": [
...         {"type": "C", "value": 3},
...         {"type": "D", "value": 4}
...     ]}
... ]
>>> other = [
...     {"database": "db1", "code": "A"},
...     {"database": "db2", "code": "C"}
... ]
>>> linked = link_iterable_by_fields(data, other=other, fields=["code"])
>>> linked[0]["exchanges"][0]["input"]
('db1', 'A')
>>> linked[1]["exchanges"][0]["input"]
('db2', 'C')
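The linking behaviour described in the notes can be sketched with a lookup table keyed on the selected fields. This is a simplified illustration and not the library's implementation: the name `link_by_fields_sketch` is hypothetical, the kind, internal and relink options are left out, and a plain ValueError stands in for StrategyError.

```python
def link_by_fields_sketch(unlinked, other, fields):
    """Link exchanges to candidates that agree on every field in `fields`."""
    # Index candidates by the tuple of their values for the chosen fields.
    candidates = {}
    for obj in other:
        key = tuple(obj.get(field) for field in fields)
        candidates.setdefault(key, []).append(obj)

    for ds in unlinked:
        for exc in ds.get("exchanges", []):
            if exc.get("input"):
                continue  # already linked; this sketch never relinks
            matches = candidates.get(tuple(exc.get(f) for f in fields), [])
            if len(matches) > 1:
                # Stand-in for the non-unique link error described above.
                raise ValueError(f"Non-unique match for {exc!r} on fields {fields}")
            if matches:
                exc["input"] = (matches[0]["database"], matches[0]["code"])
    return unlinked


other = [
    {"database": "db1", "code": "a1", "name": "steel", "unit": "kilogram"},
    {"database": "db1", "code": "b2", "name": "water", "unit": "litre"},
]
data = [{"exchanges": [{"name": "steel", "unit": "kilogram", "amount": 2}]}]
linked = link_by_fields_sketch(data, other, fields=["name", "unit"])
```

Exchanges with no matching candidate are left without an "input" in this sketch, so they remain unlinked.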

bw2io.strategies.generic.link_technosphere_by_activity_hash(db, external_db_name: str | None = None, fields: Iterable[str] | None = None)

Link technosphere exchanges using the activity_hash function.

If external_db_name is provided, link technosphere exchanges against an external database, otherwise link internally.

Parameters:

db (obj) – The database to link exchanges in.
external_db_name (str, optional) – The name of an external database to link against. Default is None.
fields (list of str, optional) – The fields to use for linking exchanges. If None, all fields will be used.

Returns:

linked – A list of tuples representing the linked exchanges.

Return type:

list of tuples

Raises:

StrategyError – If the external database name provided is not found in the list of available databases.

Examples

Link technosphere exchanges internally:

>>> from bw2data import Database
>>> db = Database('example_db')
>>> linked = link_technosphere_by_activity_hash(db)

Link technosphere exchanges against an external database using specific fields:

>>> linked = link_technosphere_by_activity_hash(
...     db,
...     external_db_name='other_db',
...     fields=['name', 'unit']
... )

bw2io.strategies.generic.match_against_only_available_in_given_context_tree(data: List[dict], other_db_name: str, fields: List[str] = ['name', 'unit', 'categories'], kinds: List[str] = labels.biosphere_edge_types) → List[dict]

For unlinked edges with a categories context (‘a’, ‘b’, …), try to match against flows in other_db_name with categories context (‘a’, ‘c’) if that flow is the only one available in other_db_name within the context tree (‘a’,).

To use this function as a strategy, you will need to curry it first using functools.partial.

Parameters:

data (list[dict]) – The list of activities whose unlinked exchanges should be matched.
other_db_name (str) – The name of the database with flows to link to.
fields (list[str]) – List of field names to use when determining if there is a match.
kinds (list[str]) – Try to match exchanges with these type values.
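Currying with functools.partial binds the extra arguments ahead of time, so that the resulting callable needs only the data. The sketch below assumes strategies are invoked with the data as their only argument; `match_sketch` is a hypothetical stand-in with a similar calling shape, and "biosphere3" is only a placeholder database name.

```python
from functools import partial


def match_sketch(data, other_db_name, fields=("name", "unit", "categories")):
    """Hypothetical stand-in with a calling shape like the documented function."""
    # Record which database and fields would be used; no real matching happens here.
    for ds in data:
        ds["matched against"] = (other_db_name, tuple(fields))
    return data


# Bind everything except `data`, leaving a one-argument callable.
strategy = partial(match_sketch, other_db_name="biosphere3", fields=("name", "unit"))

result = strategy([{"name": "A", "exchanges": []}])
```

The same pattern applies to any function on this page that takes arguments beyond the data.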

bw2io.strategies.generic.match_against_top_level_context(data: List[dict], other_db_name: str, fields: List[str] = ['name', 'unit', 'categories'], kinds: List[str] = labels.biosphere_edge_types) → List[dict]

For unlinked edges with a categories context (‘a’, ‘b’, …), try to match against flows in other_db_name with categories context (‘a’,).

To use this function as a strategy, you will need to curry it first using functools.partial.

Parameters:

data (list[dict]) – The list of activities whose unlinked exchanges should be matched.
other_db_name (str) – The name of the database with flows to link to.
fields (list[str]) – List of field names to use when determining if there is a match.
kinds (list[str]) – Try to match exchanges with these type values.

bw2io.strategies.generic.normalize_units(db: List[dict]) → List[dict]

Normalize units in datasets and their exchanges.

Parameters: db (list[dict]) – The database that needs to be normalized.
Returns: The normalized database.
Return type: list[dict]

Examples

Example 1: Normalize the unit of a single dataset.

>>> db = [{'name': 'test_db', 'unit': 'kg'}]
>>> normalize_units(db)
[{'name': 'test_db', 'unit': 'kilogram'}]

Example 2: Normalize the units of a dataset and its exchanges.

>>> db = [{
...     'name': 'test_db',
...     'unit': 'kg',
...     'exchanges': [
...         {'name': 'input', 'unit': 't'},
...         {'name': 'output', 'unit': 'lb'},
...     ]
... }]
>>> normalize_units(db)
[{'name': 'test_db',
  'unit': 'kilogram',
  'exchanges': [
      {'name': 'input', 'unit': 'tonne'},
      {'name': 'output', 'unit': 'pound'}
  ]}]

bw2io.strategies.generic.set_code_by_activity_hash(db: List[dict], overwrite: bool = False) → List[dict]

Set the dataset code for each dataset in the given database using activity_hash.

Parameters:

db (list[dict]) – The list of datasets to set codes in.
overwrite (bool, optional) – Whether to overwrite existing codes. Default is False.

Returns:

The list of datasets with updated codes.

Return type:

list[dict]

Notes

The dataset code is a unique identifier for each dataset in the database. It is generated by hashing the dataset dictionary with activity_hash.

Examples

>>> from bw2data import Database
>>> db = Database('example_db')
>>> set_code_by_activity_hash(db)
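The idea of deriving a code from a dataset's contents can be sketched with hashlib. This is an illustration only: `toy_activity_hash` and `set_code_sketch` are hypothetical names, and the choice of SHA-256, the field list and the lowercasing are assumptions for the example rather than a description of how activity_hash works.

```python
import hashlib


def toy_activity_hash(dataset, fields=("name", "unit", "location")):
    """Hash selected fields of a dataset into a hex digest (illustrative only)."""
    # Missing fields contribute an empty string; values are lowercased first.
    joined = "|".join(str(dataset.get(field, "")).lower() for field in fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def set_code_sketch(datasets, overwrite=False):
    """Assign a code to each dataset unless it already has one."""
    for ds in datasets:
        if "code" not in ds or overwrite:
            ds["code"] = toy_activity_hash(ds)
    return datasets


datasets = set_code_sketch([
    {"name": "Steel", "unit": "kilogram", "location": "GLO"},
    {"name": "Water", "unit": "litre", "code": "existing"},
])
# The first dataset receives a generated code; the second keeps "existing".
```

In this sketch, two datasets that agree on every hashed field receive the same code, so the code is unique only as far as those fields distinguish the datasets.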

bw2io.strategies.generic.split_exchanges(data: List[dict], filter_params: dict, changed_attributes: List[dict], allocation_factors: List[float] | None = None) → List[dict]

Split unlinked exchanges in data which satisfy filter_params into new exchanges with changed attributes.

changed_attributes is a list of dictionaries with the attributes that should be changed.

allocation_factors is an optional list of floats to allocate the original exchange amount to the respective copies defined in changed_attributes. They don’t have to sum to one. If allocation_factors are not defined, then exchanges are split equally.

Resets uncertainty to UndefinedUncertainty (0).

To use this function as a strategy, you will need to curry it first using functools.partial.

Parameters:

data (list[dict]) – The list of activities to split exchanges in.
filter_params (dict) – A dictionary of filter parameters to apply to the exchanges that will be split.
changed_attributes (list[dict]) – A list of dictionaries with the attributes that should be changed in the new exchanges.
allocation_factors (Optional[List[float]], optional) – An optional list of floats to allocate the original exchange amount to the respective copies defined in changed_attributes, by default None. If allocation_factors are not defined, then exchanges are split equally.

Returns:

The updated list of activities with exchanges split.

Return type:

list[dict]

Examples

>>> data = [{"name": "A", "exchanges": [{"name": "foo", "location": "bar", "amount": 20}, {"name": "food", "location": "bar", "amount": 12}]}]
>>> split_exchanges(data, {"name": "foo"}, [{"location": "A"}, {"location": "B", "cat": "dog"}])
[{'name': 'A', 'exchanges': [{'name': 'food', 'location': 'bar', 'amount': 12}, {'name': 'foo', 'location': 'A', 'amount': 10.0, 'uncertainty_type': 0}, {'name': 'foo', 'location': 'B', 'amount': 10.0, 'uncertainty_type': 0, 'cat': 'dog'}]}]
>>> data = [{"name": "B", "exchanges": [{"name": "bar", "location": "foo", "amount": 25}, {"name": "bard", "location": "foo", "amount": 13}]}]
>>> split_exchanges(data, {"name": "bard", "location": "foo"}, [{"name": "new", "location": "bar"}], [0.3])
[{'name': 'B', 'exchanges': [{'name': 'bar', 'location': 'foo', 'amount': 25}, {'name': 'new', 'location': 'bar', 'amount': 3.9000000000000004, 'uncertainty_type': 0}]}]

bw2io.strategies.generic.tupleize_categories(db: List[dict]) → List[dict]

Convert the “categories” fields in a given database and its exchanges to tuples.

Parameters: db (list[dict]) – The list of datasets to convert categories in.
Returns: The list of datasets with category fields converted to tuples.
Return type: list[dict]

Examples

>>> from bw2data import Database
>>> db = Database('example_db')
>>> tupleize_categories(db)
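On plain dictionaries, the conversion can be sketched as follows. This is a minimal illustration, not the library's code; the name `tupleize_categories_sketch` and the sample data are hypothetical.

```python
def tupleize_categories_sketch(datasets):
    """Convert list-valued "categories" fields to tuples, in place."""
    for ds in datasets:
        if ds.get("categories"):
            ds["categories"] = tuple(ds["categories"])
        for exc in ds.get("exchanges", []):
            if exc.get("categories"):
                exc["categories"] = tuple(exc["categories"])
    return datasets


datasets = tupleize_categories_sketch([
    {
        "name": "A",
        "categories": ["air", "urban"],
        "exchanges": [{"name": "CO2", "categories": ["air"]}],
    }
])
```

Tuples are hashable, unlike lists, so the converted values can serve as dictionary keys or set members.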
