mass.simulation.ensemble

Module to create and manage an ensemble of models.

The methods of the ensemble submodule are designed to generate an ensemble of models. The submodule contains various methods that assist in generating multiple models from existing MassModels, using flux data or concentration data in pandas.DataFrames (e.g. generated from conc_sampling). There are also methods to help ensure that models are thermodynamically feasible and can reach steady states with or without perturbations applied.

In addition to containing various methods that can be combined into an ensemble generation workflow, the ensemble submodule contains the generate_ensemble_of_models() function, which is optimized for performance when generating a large number of models.

The generate_ensemble_of_models() function also validates the user input before generating any models, which reduces the likelihood of a user error stopping the model generation process before completion. However, time is spent in the function's setup, meaning that performance gains may not be seen when generating a smaller number of models.

Module Contents

Functions

create_models_from_flux_data(reference_model[, data, ...])

Generate ensemble of models for a given set of flux data.

create_models_from_concentration_data(reference_model)

Generate ensemble of models for a given set of concentration data.

ensure_positive_percs(models[, reactions, ...])

Separate models based on whether all calculated PERCs are positive.

ensure_steady_state(models[, strategy, perturbations, ...])

Separate models based on whether a steady state can be reached.

generate_ensemble_of_models(reference_model[, ...])

Generate an ensemble of models for given data sets.

mass.simulation.ensemble.create_models_from_flux_data(reference_model, data=None, raise_error=False, **kwargs)[source]

Generate ensemble of models for a given set of flux data.

Parameters
  • reference_model (MassModel) – A MassModel object to treat as the reference model.

  • data (pandas.DataFrame) – A pandas.DataFrame containing the flux data for generation of the models. Each row is a different set of flux values to generate a model for, and each column corresponds to the reaction identifier for the flux value.

  • raise_error (bool) – Whether to raise an error upon failing to generate a model from a given reference. Default is False.

  • **kwargs

    verbose :

    bool indicating the verbosity of the function.

    Default is False.

    suffix :

    str representing the suffix to append to generated models.

    Default is '_F'.

Returns

new_models – A list of successfully generated MassModel objects.

Return type

list

Raises

MassEnsembleError – Raised if generation of a model fails and raise_error=True.
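
The row-per-model layout and the raise_error switch described above can be sketched in plain Python. This is only an illustration of the documented behavior, not the library's implementation: build_from_rows, toy_builder, EnsembleErrorStandIn, the reaction identifiers, and the "Model_F<index>" naming are all hypothetical, and plain dicts stand in for the rows of a pandas.DataFrame so the sketch runs without MASSpy or pandas installed.

```python
class EnsembleErrorStandIn(Exception):
    """Hypothetical stand-in for MassEnsembleError so the sketch runs alone."""


def build_from_rows(rows, build_one, raise_error=False):
    """Call build_one on each row and collect the successful results."""
    new_models = []
    for index, row in enumerate(rows):
        try:
            new_models.append(build_one(index, row))
        except ValueError as exc:
            if raise_error:
                raise EnsembleErrorStandIn(f"row {index} failed") from exc
            # raise_error=False: skip the failing row and keep going.
    return new_models


# Each row is one set of flux values; the keys play the role of the column
# labels (reaction identifiers). The identifiers here are made up.
flux_rows = [
    {"R1": 1.0, "R2": 0.5},
    {"R1": 1.2, "R2": None},  # deliberately incomplete row
    {"R1": 0.9, "R2": 0.4},
]


def toy_builder(index, row):
    """Hypothetical builder: returns a name and fails on missing values."""
    if any(value is None for value in row.values()):
        raise ValueError("missing flux value")
    return f"Model_F{index}"  # assumed naming: suffix followed by row index


models = build_from_rows(flux_rows, toy_builder)
print(models)  # ['Model_F0', 'Model_F2']
```

With raise_error=True the same call would stop at the second row and raise instead of returning a shorter list.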

mass.simulation.ensemble.create_models_from_concentration_data(reference_model, data=None, raise_error=False, **kwargs)[source]

Generate ensemble of models for a given set of concentration data.

Parameters
  • reference_model (MassModel) – A MassModel object to treat as the reference model.

  • data (pandas.DataFrame) – A pandas.DataFrame containing the concentration data for generation of the models. Each row is a different set of concentration values to generate a model for, and each column corresponds to the metabolite identifier for the concentration value.

  • raise_error (bool) – Whether to raise an error upon failing to generate a model from a given reference. Default is False.

  • **kwargs

    verbose :

    bool indicating the verbosity of the function.

    Default is False.

    suffix :

    str representing the suffix to append to generated models.

    Default is '_C'.

Returns

new_models – A list of successfully generated MassModel objects.

Return type

list

Raises

MassEnsembleError – Raised if generation of a model fails and raise_error=True.

mass.simulation.ensemble.ensure_positive_percs(models, reactions=None, raise_error=False, update_values=False, **kwargs)[source]

Separate models based on whether all calculated PERCs are positive.

Parameters
  • models (iterable) – An iterable of MassModel objects to use for PERC calculations.

  • reactions (iterable) – An iterable of reaction identifiers to calculate the PERCs for. If None, all reactions in the model will be used.

  • raise_error (bool) – Whether to raise an error upon failing to calculate the PERCs for a model. Default is False.

  • update_values (bool) – Whether to update the PERC values for models that generate all positive PERCs. Default is False.

  • **kwargs

    verbose :

    bool indicating the verbosity of the function.

    Default is False.

    at_equilibrium_default :

    float value to set the pseudo-order rate constant if the reaction is at equilibrium.

    Default is 100,000.

Returns

  • tuple (positive, negative)

  • positive (list) – A list of MassModel objects whose calculated PERC values were all positive.

  • negative (list) – A list of MassModel objects for which at least one calculated PERC value was not positive.

Raises

MassEnsembleError – Raised if PERC calculation fails and raise_error=True.
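
The (positive, negative) return contract can be illustrated with a small partition over precomputed values. This is a sketch under assumptions rather than the library's code: split_by_positive_percs is a hypothetical helper, and the model names, parameter names, and numbers in perc_table are made up.

```python
def split_by_positive_percs(perc_table):
    """Partition model ids by whether every PERC value is strictly positive."""
    positive, negative = [], []
    for model_id, percs in perc_table.items():
        if all(value > 0 for value in percs.values()):
            positive.append(model_id)
        else:
            negative.append(model_id)
    return positive, negative


# Hypothetical precomputed PERC values, keyed by model and parameter name.
perc_table = {
    "Model_F0": {"kf_R1": 2.5, "kf_R2": 0.8},
    "Model_F1": {"kf_R1": -1.3, "kf_R2": 0.8},  # one negative value
}

positive, negative = split_by_positive_percs(perc_table)
print(positive)  # ['Model_F0']
print(negative)  # ['Model_F1']
```

A single non-positive value is enough to place a model in the second list, which matches the "all calculated PERCs are positive" wording of the summary.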

mass.simulation.ensemble.ensure_steady_state(models, strategy='simulate', perturbations=None, solver_options=None, update_values=False, **kwargs)[source]

Separate models based on whether a steady state can be reached.

All kwargs are passed to find_steady_state().

Parameters
  • models (MassModel, iterable) – A MassModel or an iterable of MassModel objects to find a steady state for.

  • strategy (str) –

    The strategy for finding the steady state. Must be one of the following:

    • 'simulate'

    • 'nleq1'

    • 'nleq2'

  • perturbations (dict) – A dict of perturbations to incorporate into the simulation. Models must reach a steady state with the given perturbation to be considered feasible. See simulation documentation for more information on valid perturbations.

  • solver_options (dict) – A dict of options to pass to the solver utilized in determining a steady state. Solver options should be for the roadrunner.Integrator if strategy="simulate", otherwise options should correspond to the roadrunner.SteadyStateSolver.

  • update_values (bool) – Whether to update the model with the steady state results. Default is False. Only updates models that reached steady state.

  • **kwargs

    verbose :

    bool indicating the verbosity of the method.

    Default is False.

    steps :

    int indicating the number of steps at which the output is sampled, where the samples are evenly spaced and steps = (number of time points) - 1. The steps and the number of time points may not both be specified. Only valid for strategy='simulate'.

    Default is None.

    tfinal :

    float indicating the final time point to use when simulating to long times to find a steady state. Only valid for strategy='simulate'.

    Default is 1e8.

    num_attempts :

    int indicating the number of attempts the steady state solver should make before determining that a steady state cannot be found. Only valid for strategy='nleq1' or strategy='nleq2'.

    Default is 2.

    decimal_precision :

    bool indicating whether to apply the decimal_precision attribute of the MassConfiguration to the solution values.

    Default is False.

Returns

  • tuple (feasible, infeasible)

  • feasible (list) – A list of MassModel objects that could successfully reach a steady state.

  • infeasible (list) – A list of MassModel objects that could not successfully reach a steady state.
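
The relation steps = (number of time points) - 1 from the steps kwarg can be checked with a few lines of arithmetic. The helper below is a hypothetical illustration only; in particular, the start time of 0 is an assumption, and the small tfinal is chosen for readability rather than taken from the default.

```python
def sample_times(tfinal, steps, t0=0.0):
    """Return steps + 1 evenly spaced time points from t0 to tfinal."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    width = (tfinal - t0) / steps
    return [t0 + i * width for i in range(steps + 1)]


times = sample_times(tfinal=100.0, steps=4)
print(len(times))  # 5
print(times)       # [0.0, 25.0, 50.0, 75.0, 100.0]
```

Four steps produce five time points, which is why the two quantities cannot be specified independently.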

mass.simulation.ensemble.generate_ensemble_of_models(reference_model, flux_data=None, conc_data=None, ensure_positive_percs=None, strategy=None, perturbations=None, **kwargs)[source]

Generate an ensemble of models for given data sets.

This function is optimized for performance when generating a large ensemble of models, compared to combining the individual methods of the ensemble submodule. However, it may not provide as much control over the process as a combination of the other methods defined in the ensemble submodule.

Notes

  • Only one data set is required to generate the ensemble, meaning that a flux data set can be given without a concentration data set, and vice versa.

  • If x flux data samples and y concentration data samples are provided, x * y total models will be generated.

  • If models deemed infeasible are to be returned, ensure the return_infeasible kwarg is set to True.
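
The x * y count in the notes above follows from pairing every flux sample with every concentration sample. A minimal sketch of that pairing is shown here; ensemble_names is a hypothetical helper, and the way the suffixes and sample indices are joined into a name is an assumption made for illustration, not the library's confirmed naming scheme.

```python
from itertools import product


def ensemble_names(reference_id, n_flux, n_conc,
                   flux_suffix="_F", conc_suffix="_C"):
    """Name one candidate model per (flux sample, conc sample) pair."""
    return [
        f"{reference_id}{flux_suffix}{i}{conc_suffix}{j}"
        for i, j in product(range(n_flux), range(n_conc))
    ]


names = ensemble_names("Ref", n_flux=3, n_conc=2)
print(len(names))  # 6
print(names[0])    # Ref_F0_C0
print(names[-1])   # Ref_F2_C1
```

Because the count grows multiplicatively, modest sample sizes in each data set can still produce a large number of candidate models.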

Parameters
  • reference_model (MassModel) – The reference model used in generating the ensemble.

  • flux_data (pandas.DataFrame or None) – A pandas.DataFrame containing the flux data for generation of the models. Each row is a different set of flux values to generate a model for, and each column corresponds to the reaction identifier for the flux value.

  • conc_data (pandas.DataFrame or None) – A pandas.DataFrame containing the concentration data for generation of the models. Each row is a different set of concentration values to generate a model for, and each column corresponds to the metabolite identifier for the concentration value.

  • ensure_positive_percs – A list of reactions to calculate PERCs for. The PERCs must all be positive for a model to be considered feasible, and feasible models are updated with the new PERC values. If None, no PERCs will be checked.

  • strategy (str, None) –

    The strategy for finding the steady state. Must be one of the following:

    • 'simulate'

    • 'nleq1'

    • 'nleq2'

    If a strategy is given, models must reach a steady state to be considered feasible. All feasible models are updated to steady state. If None, no attempts will be made to determine whether a generated model can reach a steady state.

  • perturbations (dict) –

    A dict of perturbations to incorporate into the simulation, or a list of perturbation dictionaries where each dict is applied to a simulation. Models must reach a steady state with all given perturbation dictionaries to be considered feasible. See simulation documentation for more information on valid perturbations.

    Ignored if strategy=None.

  • **kwargs

    solver_options :

    dict of options to pass to the solver utilized in determining a steady state. Solver options should be for the roadrunner.Integrator if strategy="simulate", otherwise options should correspond to the roadrunner.SteadyStateSolver.

    Default is None.

    verbose :

    bool indicating the verbosity of the function.

    Default is False.

    decimal_precision :

    bool indicating whether to apply the decimal_precision attribute of the MassConfiguration to the solution values.

    Default is False.

    flux_suffix :

    str representing the suffix to append to generated models indicating the flux data set used.

    Default is '_F'.

    conc_suffix :

    str representing the suffix to append to generated models indicating the conc data set used.

    Default is '_C'.

    at_equilibrium_default :

    float value to set the pseudo-order rate constant if the reaction is at equilibrium.

    Default is 100,000. Ignored if ensure_positive_percs=None.

    return_infeasible :

    bool indicating whether to generate and return an Ensemble containing the models deemed infeasible.

    Default is False.

Returns

  • feasible (list) – A list containing the MassModel objects that are deemed feasible by successfully passing all PERC and simulation checks in the ensemble building process.

  • infeasible (list) – A list containing the MassModel objects that are deemed infeasible by failing one of the PERC or simulation checks in the ensemble building process.
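
How the feasible and infeasible lists relate to the checks can be sketched as a simple classification loop. This is one reading of the description, not the library's implementation: classify and the toy model dicts are hypothetical, string labels stand in for perturbation dictionaries, and checking the unperturbed case before the perturbed ones is an assumption.

```python
def classify(models, percs_ok, reaches_steady_state, perturbations=()):
    """Split models by whether they pass the PERC check and every scenario."""
    feasible, infeasible = [], []
    # None represents the unperturbed case (assumed to be checked as well).
    scenarios = [None, *perturbations]
    for model in models:
        passed = percs_ok(model) and all(
            reaches_steady_state(model, scenario) for scenario in scenarios
        )
        (feasible if passed else infeasible).append(model)
    return feasible, infeasible


# Toy stand-ins: each "model" carries precomputed check outcomes.
toy_models = [
    {"id": "A", "percs": True, "steady": {None: True, "p1": True}},
    {"id": "B", "percs": False, "steady": {None: True, "p1": True}},
    {"id": "C", "percs": True, "steady": {None: True, "p1": False}},
]

feasible, infeasible = classify(
    toy_models,
    percs_ok=lambda m: m["percs"],
    reaches_steady_state=lambda m, s: m["steady"][s],
    perturbations=["p1"],  # label standing in for one perturbation dict
)
print([m["id"] for m in feasible])    # ['A']
print([m["id"] for m in infeasible])  # ['B', 'C']
```

Model B fails the PERC check and model C fails under the perturbation, so only model A passes every check.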