bw2io.strategies.generic#
Module Contents#
Functions#
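add_database_name | Adds a database name to each dataset in a list of datasets.
assign_only_product_as_production | Assign only product as reference product.
convert_activity_parameters_to_list | Convert activity parameters from a dictionary to a list of dictionaries.
convert_uncertainty_types_to_integers | Convert uncertainty types in a list of datasets to integers.
drop_falsey_uncertainty_fields_but_keep_zeros | Drop uncertainty fields that are falsey (e.g. '', None, False) but keep zero and NaN.
drop_unlinked | Remove all exchanges in a given database that don't have inputs.
format_nonunique_key_error | Generate a formatted error message for a dataset that can't be uniquely linked to the target database.
link_iterable_by_fields | Link objects in unlinked to objects in other using the fields listed in fields.
link_technosphere_by_activity_hash | Link technosphere exchanges using the activity_hash function.
normalize_units | Normalize units in datasets and their exchanges.
set_code_by_activity_hash | Set the dataset code for each dataset in the given database using activity_hash.
split_exchanges | Split unlinked exchanges in data which satisfy filter_params into new exchanges with changed attributes.
tupleize_categories | Convert the "categories" fields in a given database and its exchanges to tuples.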
- bw2io.strategies.generic.add_database_name(db, name)[source]#
Adds a database name to each dataset in a list of datasets.
Parameters#
- db : list[dict]
The list of datasets to add the database name to.
- name : str
The name of the database to be added to each dataset.
Returns#
- list[dict]
The updated list of datasets with the database name added to each dataset.
Examples#
>>> db = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}]
>>> add_database_name(db, "X")
[{'id': 1, 'name': 'A', 'database': 'X'}, {'id': 2, 'name': 'B', 'database': 'X'}]
An empty list input returns an empty list.
>>> add_database_name([], "Y")
[]
- bw2io.strategies.generic.assign_only_product_as_production(db)[source]#
Assign only product as reference product.
For each dataset in db, this function checks if there is only one production exchange and no reference product already assigned. If this is the case, the reference product is set to the name of the production exchange, and the following fields are replaced if not already specified:
- 'name' - name of reference product
- 'unit' - unit of reference product
- 'production amount' - amount of reference product
Parameters#
- db : iterable
An iterable of dictionaries containing the datasets to process.
Returns#
- iterable
An iterable of dictionaries containing the processed datasets.
Raises#
- AssertionError
If a production exchange does not have a name attribute.
Examples#
>>> data = [{'name': 'Input 1', 'exchanges': [{'type': 'production', 'name': 'Product 1', 'amount': 1}, {'type': 'technosphere', 'name': 'Input 2', 'amount': 2}]}, {'name': 'Input 2', 'exchanges': [{'type': 'production', 'name': 'Product 2', 'amount': 3}, {'type': 'technosphere', 'name': 'Input 3', 'amount': 4}]}]
>>> processed_data = assign_only_product_as_production(data)
>>> processed_data[0]['reference product']
'Product 1'
>>> processed_data[0]['name']
'Input 1'
>>> processed_data[1]['reference product']
'Product 2'
>>> processed_data[1]['unit']
'Unknown'
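The rule described above can be sketched in plain Python. This is a minimal stand-in written from the description only, not the bw2io implementation; the function name `assign_only_product_sketch` is hypothetical, and the `'Unknown'` unit fallback is taken from the documented example.

```python
def assign_only_product_sketch(db):
    """Illustrative stand-in for the behaviour described above (assumption)."""
    for ds in db:
        if ds.get("reference product"):
            continue  # a reference product is already assigned
        products = [
            exc for exc in ds.get("exchanges", [])
            if exc.get("type") == "production"
        ]
        if len(products) != 1:
            continue  # zero or several production exchanges: leave untouched
        product = products[0]
        assert product.get("name"), "production exchange needs a name"
        ds["reference product"] = product["name"]
        # Fill in the remaining fields only when the dataset lacks them.
        ds.setdefault("name", product["name"])
        ds.setdefault("unit", product.get("unit", "Unknown"))
        ds.setdefault("production amount", product.get("amount"))
    return db


datasets = [
    {"name": "Input 1", "exchanges": [
        {"type": "production", "name": "Product 1", "amount": 1},
        {"type": "technosphere", "name": "Input 2", "amount": 2},
    ]},
    {"name": "Mixed", "exchanges": [
        {"type": "production", "name": "P1", "amount": 1},
        {"type": "production", "name": "P2", "amount": 1},
    ]},
]
processed = assign_only_product_sketch(datasets)
print(processed[0]["reference product"])
print("reference product" in processed[1])
```

The second dataset has two production exchanges, so the sketch leaves it without a reference product.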
- bw2io.strategies.generic.convert_activity_parameters_to_list(data)[source]#
Convert activity parameters from a dictionary to a list of dictionaries.
Parameters#
- data : list[dict]
The list of activities to convert parameters from.
Returns#
- list[dict]
The updated list of activities with parameters converted to a list of dictionaries.
Examples#
>>> data = [{"name": "A", "parameters": {"param1": {"amount": 1}, "param2": {"amount": 2}}}, {"name": "B", "parameters": {"param3": {"amount": 3}, "param4": {"amount": 4}}}]
>>> convert_activity_parameters_to_list(data)
[{'name': 'A', 'parameters': [{'amount': 1, 'name': 'param1'}, {'amount': 2, 'name': 'param2'}]}, {'name': 'B', 'parameters': [{'amount': 3, 'name': 'param3'}, {'amount': 4, 'name': 'param4'}]}]
Activities without parameters remain unchanged.
>>> data = [{"name": "C"}]
>>> convert_activity_parameters_to_list(data)
[{'name': 'C'}]
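A minimal sketch of the dictionary-to-list conversion, assuming each parameter value is itself a dictionary and that the parameter's key is stored under `name`. The function name `parameters_to_list_sketch` is hypothetical and this is not the bw2io source.

```python
from copy import deepcopy


def parameters_to_list_sketch(data):
    """Turn {'p': {...}} into [{..., 'name': 'p'}] for every activity (assumption)."""
    for ds in data:
        params = ds.get("parameters")
        if isinstance(params, dict):
            ds["parameters"] = [
                # Copy the value so the original parameter dict is not mutated.
                {**deepcopy(value), "name": key}
                for key, value in params.items()
            ]
    return data


activities = [
    {"name": "A", "parameters": {"p1": {"amount": 1}, "p2": {"amount": 2}}},
    {"name": "C"},  # no parameters: left unchanged
]
converted = parameters_to_list_sketch(activities)
print(converted[0]["parameters"])
```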
- bw2io.strategies.generic.convert_uncertainty_types_to_integers(db)[source]#
Convert uncertainty types in a list of datasets to integers.
Parameters#
- db : list[dict]
The list of datasets containing uncertainty types to convert.
Returns#
- list[dict]
The updated list of datasets with uncertainty types converted to integers where possible.
Examples#
Values that cannot be converted to integers are left unchanged.
>>> db = [{"name": "A", "exchanges": [{"uncertainty type": "triangular"}]}, {"name": "B", "exchanges": [{"uncertainty type": "lognormal"}]}]
>>> convert_uncertainty_types_to_integers(db)
[{'name': 'A', 'exchanges': [{'uncertainty type': 'triangular'}]}, {'name': 'B', 'exchanges': [{'uncertainty type': 'lognormal'}]}]
Numeric strings and float values are converted to integers.
>>> db = [{"name": "C", "exchanges": [{"uncertainty type": "1"}, {"uncertainty type": 2.0}]}]
>>> convert_uncertainty_types_to_integers(db)
[{'name': 'C', 'exchanges': [{'uncertainty type': 1}, {'uncertainty type': 2}]}]
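The "where possible" conversion can be sketched with Python's built-in `int()` wrapped in a `try`/`except`. This is an assumption about how such a strategy could work, not the library's code; `uncertainty_types_to_int_sketch` is a hypothetical name.

```python
def uncertainty_types_to_int_sketch(db):
    """Convert 'uncertainty type' to int where int() accepts the value (assumption)."""
    for ds in db:
        for exc in ds.get("exchanges", []):
            try:
                exc["uncertainty type"] = int(exc["uncertainty type"])
            except (KeyError, TypeError, ValueError):
                # Missing field or non-numeric value: leave the exchange as it is.
                pass
    return db


sample = [{"name": "A", "exchanges": [
    {"uncertainty type": "1"},
    {"uncertainty type": 2.0},
    {"uncertainty type": "triangular"},
    {"amount": 1},
]}]
converted_types = uncertainty_types_to_int_sketch(sample)
print([exc.get("uncertainty type") for exc in converted_types[0]["exchanges"]])
```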
- bw2io.strategies.generic.drop_falsey_uncertainty_fields_but_keep_zeros(db)[source]#
Drop uncertainty fields that are falsey (e.g. '', None, False) but keep zero and NaN.
Note that this function doesn't strip False, because False == 0 evaluates to True in Python and zeros are kept.
Parameters#
- db : list[dict]
The list of datasets to drop uncertainty fields from.
Returns#
- list[dict]
The updated list of datasets with falsey uncertainty fields dropped.
Examples#
>>> db = [{"name": "A", "exchanges": [{"amount": 1, "minimum": 0, "maximum": None, "shape": ""}]}]
>>> drop_falsey_uncertainty_fields_but_keep_zeros(db)
[{'name': 'A', 'exchanges': [{'amount': 1, 'minimum': 0}]}]
Float values of NaN are kept in the dictionary.
>>> db = [{"name": "B", "exchanges": [{"loc": 0.0, "scale": 0.5, "minimum": float('nan')},
...     {"loc": 0.0, "scale": 0.5}]}]
>>> drop_falsey_uncertainty_fields_but_keep_zeros(db)
[{'name': 'B', 'exchanges': [{'loc': 0.0, 'scale': 0.5, 'minimum': nan}, {'loc': 0.0, 'scale': 0.5}]}]
- bw2io.strategies.generic.drop_unlinked(db)[source]#
Remove all exchanges in a given database that don't have inputs.
Exchanges that don't have any inputs are often referred to as "unlinked exchanges". These exchanges can be a sign of an incomplete or poorly structured database.
Parameters#
- db : obj
The database to remove unlinked exchanges from.
Returns#
- obj
The modified database object with removed unlinked exchanges.
Notes#
This is the nuclear option - use at your own risk! ⚠️
Examples#
>>> db = [
...     {"name": "Product A", "unit": "kg", "exchanges": [{"input": True, "amount": 1, "name": "Input 1", "unit": "kg"}]},
...     {"name": "Product B", "unit": "kg", "exchanges": [{"input": True, "amount": 1, "name": "Input 2", "unit": "kg"}, {"input": False, "amount": 0.5, "name": "Product A", "unit": "kg"}]},
...     {"name": "Product C", "unit": "kg", "exchanges": [{"input": False, "amount": 0.75, "name": "Product A", "unit": "kg"}]}
... ]
>>> drop_unlinked(db)
[{'name': 'Product A', 'unit': 'kg', 'exchanges': [{'input': True, 'amount': 1, 'name': 'Input 1', 'unit': 'kg'}]},
 {'name': 'Product B', 'unit': 'kg', 'exchanges': [{'input': True, 'amount': 1, 'name': 'Input 2', 'unit': 'kg'}]},
 {'name': 'Product C', 'unit': 'kg', 'exchanges': []}]
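As a rough illustration of what dropping unlinked exchanges amounts to, the sketch below keeps only exchanges whose `input` value is truthy. It assumes datasets are plain dictionaries with an `exchanges` list; `drop_unlinked_sketch` is a hypothetical name, not the bw2io function.

```python
def drop_unlinked_sketch(db):
    """Keep only exchanges that carry a truthy 'input' value (assumption)."""
    for ds in db:
        ds["exchanges"] = [
            exc for exc in ds.get("exchanges", []) if exc.get("input")
        ]
    return db


inventory = [
    {"name": "Product A", "exchanges": [
        {"name": "steel", "amount": 1, "input": ("db", "steel-code")},
        {"name": "mystery flow", "amount": 0.5},            # no input at all
        {"name": "water", "amount": 2, "input": None},      # input present but empty
    ]},
]
cleaned = drop_unlinked_sketch(inventory)
print([exc["name"] for exc in cleaned[0]["exchanges"]])
```

Because the removed exchanges are lost, it may be worth inspecting them before applying a filter like this.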
- bw2io.strategies.generic.format_nonunique_key_error(obj, fields, others)[source]#
Generate a formatted error message for a dataset that can't be uniquely linked to the target database.
Parameters#
- obj : dict
The problematic dataset that can't be uniquely linked to the target database.
- fields : list
The list of fields to include in the error message.
- others : list
A list of other similar datasets.
Returns#
- str
A formatted error message.
See Also#
pprint.pformat : Format a Python object into a pretty-printed string.
Notes#
This function takes the problematic dataset and a list of other similar datasets, and returns an error message that includes the problematic dataset and a list of possible target datasets that may match it.
Raises#
None
Examples#
>>> obj = {'name': 'Electricity', 'location': 'CH'}
>>> fields = ['name', 'location']
>>> others = [{'name': 'Electricity', 'location': 'CH', 'filename': 'file1'}, {'name': 'Electricity', 'location': 'CH', 'filename': 'file2'}]
>>> format_nonunique_key_error(obj, fields, others)
"Object in source database can't be uniquely linked to target database.
Problematic dataset is: {'name': 'Electricity', 'location': 'CH'} Possible targets include (at least one not shown): [{'name': 'Electricity', 'location': 'CH', 'filename': 'file1'}, {'name': 'Electricity', 'location': 'CH', 'filename': 'file2'}]"
- bw2io.strategies.generic.link_iterable_by_fields(unlinked, other=None, fields=None, kind=None, internal=False, relink=False)[source]#
Link objects in unlinked to objects in other using the fields listed in fields.
Parameters#
- unlinked : iterable
An iterable of dictionaries containing objects to be linked.
- other : iterable, optional
An iterable of dictionaries containing objects to link to. If not specified, other is set to unlinked.
- fields : iterable, optional
An iterable of strings indicating which fields should be used to match objects. If not specified, all fields will be used.
- kind : str or iterable, optional
If specified, limit linking to exchanges of the given kind. kind can be a string or an iterable of strings.
- internal : bool, optional
If True, link objects in unlinked to other objects in unlinked. Each object must have the attributes database and code.
- relink : bool, optional
If True, relink objects that already have an input. Otherwise, skip objects that have already been linked.
Returns#
- iterable
An iterable of dictionaries containing linked objects.
Raises#
- StrategyError
If not all datasets in the database to be linked have database or code attributes. If there are duplicate keys for the given fields.
See Also#
activity_hash : Generate a unique hash key for a dataset.
format_nonunique_key_error : Generate an error message for datasets that can't be uniquely linked to the target database.
Notes#
This function takes two iterables of dictionaries, unlinked and other, where each dictionary represents an object to be linked. The objects are linked by matching the fields listed in fields. The function returns an iterable of dictionaries containing linked objects.
If the parameter kind is specified, only objects of the given kind are linked. If internal is True, objects in unlinked are linked to other objects in unlinked. If relink is True, objects that already have an input are linked again.
If a link is not unique, a StrategyError is raised, which includes a formatted error message generated by the format_nonunique_key_error function.
Examples#
>>> data = [
...     {"exchanges": [
...         {"type": "A", "value": 1},
...         {"type": "B", "value": 2}
...     ]},
...     {"exchanges": [
...         {"type": "C", "value": 3},
...         {"type": "D", "value": 4}
...     ]}
... ]
>>> other = [
...     {"database": "db1", "code": "A"},
...     {"database": "db2", "code": "C"}
... ]
>>> linked = link_iterable_by_fields(data, other=other, fields=["code"])
>>> linked[0]["exchanges"][0]["input"]
('db1', 'A')
>>> linked[1]["exchanges"][0]["input"]
('db2', 'C')
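The matching logic in the notes above can be illustrated with a small self-contained sketch: build a lookup from the target objects keyed by the chosen fields, reject duplicate keys, then assign a `(database, code)` tuple as `input` to each matching exchange. Everything here is an assumption for illustration: `link_by_fields_sketch` is a hypothetical name, `ValueError` stands in for `StrategyError`, and the `kind` and `internal` options are left out.

```python
def link_by_fields_sketch(unlinked, other, fields, relink=False):
    """Link exchanges to target objects by comparing the given fields (assumption)."""
    def key(obj):
        return tuple(obj.get(field) for field in fields)

    # Build the lookup table; a repeated key means links would be ambiguous.
    candidates = {}
    for ds in other:
        k = key(ds)
        if k in candidates:
            raise ValueError(f"Non-unique key {k} for fields {list(fields)}")
        candidates[k] = (ds["database"], ds["code"])

    for ds in unlinked:
        for exc in ds.get("exchanges", []):
            if exc.get("input") and not relink:
                continue  # already linked, and relinking was not requested
            match = candidates.get(key(exc))
            if match is not None:
                exc["input"] = match
    return unlinked


targets = [
    {"database": "db1", "code": "a", "name": "steel", "unit": "kilogram"},
    {"database": "db1", "code": "b", "name": "water", "unit": "litre"},
]
source = [{"name": "widget", "exchanges": [
    {"name": "steel", "unit": "kilogram", "amount": 2},
    {"name": "water", "unit": "litre", "amount": 1, "input": ("old", "x")},
    {"name": "gold", "unit": "kilogram", "amount": 0.1},
]}]
linked = link_by_fields_sketch(source, targets, fields=("name", "unit"))
print([exc.get("input") for exc in linked[0]["exchanges"]])
```

In this sketch the already-linked water exchange keeps its old input, and the gold exchange stays unlinked because no target matches it.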
- bw2io.strategies.generic.link_technosphere_by_activity_hash(db, external_db_name=None, fields=None)[source]#
Link technosphere exchanges using the activity_hash function. If external_db_name is provided, link technosphere exchanges against an external database, otherwise link internally.
Parameters#
- db : obj
The database to link exchanges in.
- external_db_name : str, optional
The name of an external database to link against. Default is None.
- fields : list of str, optional
The fields to use for linking exchanges. If None, all fields will be used.
Returns#
- linked : list of tuples
A list of tuples representing the linked exchanges.
Raises#
- StrategyError
If the external database name provided is not found in the list of available databases.
Examples#
Link technosphere exchanges internally:
>>> db = Database('example_db')
>>> linked = link_technosphere_by_activity_hash(db)
Link technosphere exchanges against an external database using specific fields:
>>> linked = link_technosphere_by_activity_hash(db, external_db_name='other_db', fields=['name', 'unit'])
- bw2io.strategies.generic.normalize_units(db)[source]#
Normalize units in datasets and their exchanges.
Parameters#
- db : dict
The database that needs to be normalized.
Returns#
- dict
The normalized database.
Examples#
Example 1: Normalize the units of a given database.
>>> db = {'name': 'test_db', 'unit': 'kg'}
>>> normalize_units(db)
{'name': 'test_db', 'unit': 'kilogram'}
Example 2: Normalize the units of a dataset and its exchanges.
>>> db = {
...     'name': 'test_db',
...     'unit': 'kg',
...     'exchanges': [
...         {'name': 'input', 'unit': 't'},
...         {'name': 'output', 'unit': 'lb'},
...     ]
... }
>>> normalize_units(db)
{'name': 'test_db', 'unit': 'kilogram', 'exchanges': [{'name': 'input', 'unit': 'tonne'}, {'name': 'output', 'unit': 'pound'}]}
- bw2io.strategies.generic.set_code_by_activity_hash(db, overwrite=False)[source]#
Set the dataset code for each dataset in the given database using activity_hash.
Parameters#
- db : obj
The database to set the dataset codes in.
- overwrite : bool, optional
Whether to overwrite existing codes. Default is False.
Returns#
- obj
The modified database object with updated dataset codes.
Notes#
The dataset code is a unique identifier for each dataset in the database. It is generated by hashing the dataset dictionary with activity_hash.
Examples#
>>> db = Database('example_db')
>>> set_code_by_activity_hash(db)
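To make the role of the overwrite flag concrete, here is a small sketch that derives a code by hashing a few identifying fields. The choice of fields (`name`, `unit`, `location`), the use of MD5, and both function names are assumptions for illustration only; this page does not state what activity_hash actually hashes.

```python
import hashlib


def code_from_fields_sketch(ds, fields=("name", "unit", "location")):
    """Hash a few identifying fields into a hex string (hypothetical stand-in)."""
    joined = "".join(str(ds.get(field, "")).lower() for field in fields)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def set_code_sketch(db, overwrite=False):
    """Assign a hashed code to each dataset, keeping existing codes by default."""
    for ds in db:
        if "code" not in ds or overwrite:
            ds["code"] = code_from_fields_sketch(ds)
    return db


datasets_with_codes = [
    {"name": "Steel", "unit": "kg", "location": "CH"},
    {"name": "Water", "unit": "l", "code": "keep-me"},
]
set_code_sketch(datasets_with_codes)
print(datasets_with_codes[0]["code"])
print(datasets_with_codes[1]["code"])
```

With `overwrite=False` the second dataset keeps its existing code; passing `overwrite=True` would replace it with a hashed one.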
- bw2io.strategies.generic.split_exchanges(data, filter_params, changed_attributes, allocation_factors=None)[source]#
Split unlinked exchanges in data which satisfy filter_params into new exchanges with changed attributes.
changed_attributes is a list of dictionaries with the attributes that should be changed.
allocation_factors is an optional list of floats to allocate the original exchange amount to the respective copies defined in changed_attributes. They don't have to sum to one. If allocation_factors are not defined, then exchanges are split equally.
Resets uncertainty to UndefinedUncertainty (0).
To use this function as a strategy, you will need to curry it first using functools.partial.
Parameters#
- data : list[dict]
The list of activities to split exchanges in.
- filter_params : dict
A dictionary of filter parameters to apply to the exchanges that will be split.
- changed_attributes : list[dict]
A list of dictionaries with the attributes that should be changed in the new exchanges.
- allocation_factors : Optional[List[float]], optional
An optional list of floats to allocate the original exchange amount to the respective copies defined in changed_attributes, by default None. If allocation_factors are not defined, then exchanges are split equally.
Returns#
- list[dict]
The updated list of activities with exchanges split.
Examples#
>>> data = [{"name": "A", "exchanges": [{"name": "foo", "location": "bar", "amount": 20}, {"name": "food", "location": "bar", "amount": 12}]}]
>>> split_exchanges(data, {"name": "foo"}, [{"location": "A"}, {"location": "B", "cat": "dog"}])
[{'name': 'A', 'exchanges': [{'name': 'food', 'location': 'bar', 'amount': 12}, {'name': 'foo', 'location': 'A', 'amount': 10.0, 'uncertainty_type': 0}, {'name': 'foo', 'location': 'B', 'amount': 10.0, 'uncertainty_type': 0, 'cat': 'dog'}]}]
>>> data = [{"name": "B", "exchanges": [{"name": "bar", "location": "foo", "amount": 25}, {"name": "bard", "location": "foo", "amount": 13}]}]
>>> split_exchanges(data, {"name": "bard", "location": "foo"}, [{"name": "new", "location": "bar"}], [0.3])
[{'name': 'B', 'exchanges': [{'name': 'bar', 'location': 'foo', 'amount': 25}, {'name': 'new', 'location': 'bar', 'amount': 3.9000000000000004, 'uncertainty_type': 0}]}]
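Strategies are called with the data as their only argument, which is why a function with extra parameters has to be curried first. The sketch below shows the `functools.partial` pattern on a simplified stand-in that only implements the equal-split case. `split_equally_sketch` is a hypothetical name, not the bw2io function, and it ignores allocation factors entirely.

```python
from copy import deepcopy
from functools import partial


def split_equally_sketch(data, filter_params, changed_attributes):
    """Split matching unlinked exchanges into equal copies (simplified stand-in)."""
    for ds in data:
        kept, added = [], []
        for exc in ds.get("exchanges", []):
            is_unlinked = not exc.get("input")
            matches = all(exc.get(k) == v for k, v in filter_params.items())
            if is_unlinked and matches:
                for attrs in changed_attributes:
                    new = deepcopy(exc)
                    new["amount"] = exc["amount"] / len(changed_attributes)
                    new["uncertainty_type"] = 0  # reset to undefined uncertainty
                    new.update(attrs)
                    added.append(new)
            else:
                kept.append(exc)
        ds["exchanges"] = kept + added
    return data


# Curry the extra arguments so the result takes only the data, like any strategy.
split_foo = partial(
    split_equally_sketch,
    filter_params={"name": "foo"},
    changed_attributes=[{"location": "A"}, {"location": "B", "cat": "dog"}],
)

activities_to_split = [{"name": "A", "exchanges": [
    {"name": "foo", "location": "bar", "amount": 20},
    {"name": "food", "location": "bar", "amount": 12},
]}]
split_result = split_foo(activities_to_split)
print([(exc["name"], exc["location"], exc["amount"]) for exc in split_result[0]["exchanges"]])
```

The same pattern would apply to the real function: bind filter_params, changed_attributes and, if needed, allocation_factors with `partial`, then pass the resulting callable wherever a single-argument strategy is expected.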
- bw2io.strategies.generic.tupleize_categories(db)[source]#
Convert the "categories" fields in a given database and its exchanges to tuples.
Parameters#
- db : obj
The database to convert categories in.
Returns#
- obj
The modified database object with converted category fields.
Examples#
>>> from bw2data import Database
>>> db = Database('example_db')
>>> tupleize_categories(db)
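The example above does not show the effect on the data, so here is a minimal sketch on plain dictionaries. It assumes the database is an iterable of dataset dictionaries with optional `categories` and `exchanges` keys; `tupleize_categories_sketch` is a hypothetical name rather than the library function.

```python
def tupleize_categories_sketch(db):
    """Convert non-empty 'categories' lists to tuples on datasets and exchanges."""
    for ds in db:
        if ds.get("categories"):
            ds["categories"] = tuple(ds["categories"])
        for exc in ds.get("exchanges", []):
            if exc.get("categories"):
                exc["categories"] = tuple(exc["categories"])
    return db


flows = [{
    "name": "Carbon dioxide",
    "categories": ["air", "urban"],
    "exchanges": [
        {"name": "CO2", "categories": ["air"]},
        {"name": "steel"},  # no categories: left untouched
    ],
}]
tupled = tupleize_categories_sketch(flows)
print(tupled[0]["categories"])
print(tupled[0]["exchanges"][0]["categories"])
```

Tuples are hashable, which makes the converted categories usable as dictionary keys or set members when matching flows.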