10. Checking Model Quality

This notebook example demonstrates the various methods for ensuring quality and consistency in models. Here, the functions of the qcqa submodule are used to inspect a broken model and identify the issues that need attention.

[1]:
import mass.example_data

from mass import MassConfiguration
from mass.util import qcqa

model = mass.example_data.create_example_model("Model_To_Repair")
Set parameter Username

10.1. Inspecting a Model

To quickly identify all issues in a model, the qcqa_model() function of the qcqa submodule can be used. The function takes a MassModel and Booleans for various kwargs as input, identifies issues in the model based on the kwargs, and prints a report outlining possible issues.

[2]:
qcqa.qcqa_model(
    model,
    parameters=True,        # Check for undefined but necessary parameters in the model
    concentrations=True,    # Check for undefined but necessary concentrations in the model
    fluxes=True,            # Check for undefined steady state fluxes for reactions in the model
    superfluous=True,       # Check for excess parameters and ensure they are consistent.
    elemental=True,         # Check mass and charge balancing of reactions in the model
    simulation_only=True,  # Check for values necessary for simulation only
)
╒═══════════════════════════════════════════════════════════════╕
│ MODEL ID: RBC_PFK                                             │
│ SIMULATABLE: False                                            │
│ PARAMETERS NUMERICALY CONSISTENT: False                       │
╞═══════════════════════════════════════════════════════════════╡
│ ============================================================= │
│                      MISSING PARAMETERS                       │
│ ============================================================= │
│ Reaction Parameters    Custom Parameters    S.S. Fluxes       │
│ ---------------------  -------------------  -------------     │
│ PGI: Keq; kf           PFK_R01: Keq_PFK_A   GAPD              │
│ PGK: kf                PFK_R11: Keq_PFK_A                     │
│ PGM: Keq               PFK_R21: Keq_PFK_A                     │
│                        PFK_R31: Keq_PFK_A                     │
│                        PFK_R41: Keq_PFK_A                     │
│ ============================================================= │
├───────────────────────────────────────────────────────────────┤
│ ============================================================= │
│                    MISSING CONCENTRATIONS                     │
│ ============================================================= │
│ Initial Conditions               Boundary Conditions          │
│ -------------------------------  ---------------------        │
│ glc__D_c (in HEX1, SK_glc__D_c)  h2o_b (in SK_h2o_c)          │
│ ============================================================= │
├───────────────────────────────────────────────────────────────┤
│ ============================================================= │
│                      CONSISTENCY CHECKS                       │
│ ============================================================= │
│ Superfluous Parameters    Elemental                           │
│ ------------------------  ---------------------------------   │
│ HEX1: Inconsistent        HEX1: {H: -3.0; O: -4.0; P: -1.0}   │
│ PYK: Consistent           PGI: {H: 3.0; O: 4.0; P: 1.0}       │
│                           G6PDH2r: {H: 3.0; O: 4.0; P: 1.0}   │
│                           DM_nadh: {charge: 2.0}              │
│                           GSHR: {charge: 2.0}                 │
│ ============================================================= │
╘═══════════════════════════════════════════════════════════════╛

The simulation_only kwarg as True ensures that identified missing values in the report (excluding steady state fluxes) are necessary for simulation. As seen above, there are a number of missing values and consistency issues that need to be addressed.

10.2. Identifying Missing Values

The report printed by the qcqa_model() function shows that there are a number of values in the model that have not yet been defined. Here, the functions of the qcqa submodule are used to retrieve the objects in the model that have missing values so that those values can be defined.

10.2.1. Missing parameters

To identify the reactions that have missing parameter values, the parameters flag is set as True. Reaction parameters for mass action rate laws (e.g., forward and reverse rate constants, equilibrium constants) and custom parameters for custom rates are checked for undefined numerical values.

[3]:
qcqa.qcqa_model(model, parameters=True)
╒══════════════════════════════════════════════╕
│ MODEL ID: RBC_PFK                            │
│ SIMULATABLE: False                           │
│ PARAMETERS NUMERICALY CONSISTENT: False      │
╞══════════════════════════════════════════════╡
│ ============================================ │
│             MISSING PARAMETERS               │
│ ============================================ │
│ Reaction Parameters    Custom Parameters     │
│ ---------------------  -------------------   │
│ PGI: Keq; kf           PFK_R01: Keq_PFK_A    │
│ PGK: kf                PFK_R11: Keq_PFK_A    │
│ PGM: Keq               PFK_R21: Keq_PFK_A    │
│                        PFK_R31: Keq_PFK_A    │
│                        PFK_R41: Keq_PFK_A    │
│ ============================================ │
╘══════════════════════════════════════════════╛

The report shows that the PGI, PGK, and PGM reactions are missing numerical values for forward rate and equilibrium constants. The get_missing_reaction_parameters() function is used to get these reaction objects from the model:

[4]:
qcqa.get_missing_reaction_parameters(model)
[4]:
{<MassReaction PGI at 0x7fdc709bba60>: 'Keq; kf',
 <MassReaction PGK at 0x7fdc709bba90>: 'kf',
 <MassReaction PGM at 0x7fdc709bbdf0>: 'Keq'}

The get_missing_reaction_parameters() function returns a dict that contains reaction objects and a string that indicates which parameters are missing. To get a subset of these reactions, a list of reaction identifiers is provided to the reaction_list argument. For example, to separate the reactions missing forward rate constants from those that are missing equilibrium constants:

[5]:
missing_kfs = qcqa.get_missing_reaction_parameters(model, reaction_list=["PGI", "PGK"])
missing_Keqs = qcqa.get_missing_reaction_parameters(model, reaction_list=["PGI", "PGM"])

print("Missing forward rate constants: {0!r}".format(list(missing_kfs)))
print("Missing equilibrium constants: {0!r}".format(list(missing_Keqs)))
Missing forward rate constants: [<MassReaction PGI at 0x7fdc709bba60>, <MassReaction PGK at 0x7fdc709bba90>]
Missing equilibrium constants: [<MassReaction PGI at 0x7fdc709bba60>, <MassReaction PGM at 0x7fdc709bbdf0>]

The get_missing_custom_parameters() function is used to identify missing custom parameters and the reactions that require them.

[6]:
qcqa.get_missing_custom_parameters(model)
[6]:
{<EnzymeModuleReaction PFK_R01 at 0x7fdc91447e20>: 'Keq_PFK_A',
 <EnzymeModuleReaction PFK_R11 at 0x7fdc91447100>: 'Keq_PFK_A',
 <EnzymeModuleReaction PFK_R21 at 0x7fdc9144ec40>: 'Keq_PFK_A',
 <EnzymeModuleReaction PFK_R31 at 0x7fdc91456520>: 'Keq_PFK_A',
 <EnzymeModuleReaction PFK_R41 at 0x7fdc91456dc0>: 'Keq_PFK_A'}

Once defined, the parameters no longer appear in the returned dict of missing values. A returned empty dict indicates that no undefined parameter values exist in the model.

[7]:
# Define missing parameters and update model
missing_parameters = {
    "kf_PGI": 2961.11, "Keq_PGI": 0.41,
    "kf_PGK": 1061655.085,
    "Keq_PGM": 0.147059,
    "Keq_PFK_A": 14.706}
model.update_parameters(missing_parameters)

print("Missing reaction parameters: {0!r}".format(qcqa.get_missing_reaction_parameters(model)))
print("Missing custom parameters: {0!r}".format(qcqa.get_missing_custom_parameters(model)))
Missing reaction parameters: {}
Missing custom parameters: {}

10.2.2. Missing fluxes

To identify the reactions that have missing steady state flux values, the fluxes kwarg is set as True.

[8]:
qcqa.qcqa_model(model, fluxes=True)
╒══════════════════════════════════════════════╕
│ MODEL ID: RBC_PFK                            │
│ SIMULATABLE: False                           │
│ PARAMETERS NUMERICALY CONSISTENT: False      │
╞══════════════════════════════════════════════╡
│ ============================================ │
│             MISSING PARAMETERS               │
│ ============================================ │
│ S.S. Fluxes                                  │
│ -------------                                │
│ GAPD                                         │
│ ============================================ │
╘══════════════════════════════════════════════╛

To get the reaction objects that are missing steady state fluxes, the get_missing_steady_state_fluxes() function is used. A returned empty list indicates that no undefined flux values exist in the model.

[9]:
missing_fluxes = qcqa.get_missing_steady_state_fluxes(model)
print("Before: {0!r}".format(missing_fluxes))

# Define missing flux value
missing_fluxes[0].steady_state_flux = 2.305

missing_fluxes = qcqa.get_missing_steady_state_fluxes(model)
print("After: {0!r}".format(missing_fluxes))
Before: [<MassReaction GAPD at 0x7fdc709bbf40>]
After: []

10.2.3. Missing concentrations

To identify the metabolites that have missing concentrations, the concentrations kwarg is set as True. Metabolite concentrations refer to the initial and boundary conditions of the model.

[10]:
qcqa.qcqa_model(model, concentrations=True)
╒══════════════════════════════════════════════════════════╕
│ MODEL ID: RBC_PFK                                        │
│ SIMULATABLE: False                                       │
│ PARAMETERS NUMERICALY CONSISTENT: False                  │
╞══════════════════════════════════════════════════════════╡
│ ======================================================== │
│                 MISSING CONCENTRATIONS                   │
│ ======================================================== │
│ Initial Conditions               Boundary Conditions     │
│ -------------------------------  ---------------------   │
│ glc__D_c (in HEX1, SK_glc__D_c)  h2o_b (in SK_h2o_c)     │
│ ======================================================== │
╘══════════════════════════════════════════════════════════╛

The get_missing_initial_conditions() function is used to return a list of metabolite objects that have undefined initial conditions:

[11]:
missing_ics = qcqa.get_missing_initial_conditions(model)
print(missing_ics)
[<MassMetabolite glc__D_c at 0x7fdc709ab040>]

The get_missing_boundary_conditions() function is used to return a list of ‘boundary metabolites’ that have undefined boundary conditions. A ‘boundary metabolite’ is a proxy metabolite for a boundary condition not represented by MassMetabolite objects.

[12]:
qcqa.get_missing_boundary_conditions(model)
[12]:
['h2o_b']

Once defined, the metabolites no longer appear in the returned list. A returned empty list means no undefined metabolite concentrations were found.

[13]:
# Define missing initial condition
missing_ics[0].initial_condition = 1.3
# Define mising boundary condition
model.boundary_conditions["h2o_b"] = 1

# Check model to ensure they have been defined
print("Missing initial conditions: {0!r}".format(qcqa.get_missing_initial_conditions(model)))
print("Missing boundary conditions: {0!r}".format(qcqa.get_missing_boundary_conditions(model)))
Missing initial conditions: []
Missing boundary conditions: []

After defining the missing values, the report displayed by the qcqa_model() function shows that the model is simulatable. However, the model parameters are not considered numerically consistent, which may present some problems during the simulation process.

[14]:
qcqa.qcqa_model(model, parameters=True, concentrations=True, fluxes=True)
╒═══════════════════════════════════════════╕
│ MODEL ID: RBC_PFK                         │
│ SIMULATABLE: True                         │
│ PARAMETERS NUMERICALY CONSISTENT: False   │
╞═══════════════════════════════════════════╡
╘═══════════════════════════════════════════╛

10.3. Consistency Checks

In addition to the undefined numerical values in the model, the initial report printed by the qcqa_model() function also indicates some issues in parameter consistency and elemental balancing. Here, the functions of the qcqa submodule are used to retrieve the objects in the model that have consistency issues so that they can be corrected.

10.3.1. Elemental

To identify the reactions that are not elementally balanced, the elemental kwarg is set as True. Note that pseudoreactions are typically unbalanced, and although boundary reactions are excluded by default, other pseudoreactions may exist in the system. In this model, the two pseudoreactions expected to be unbalanced are the DM_nadh and the GSHR reactions.

[15]:
qcqa.qcqa_model(model, elemental=True)
╒══════════════════════════════════════════════╕
│ MODEL ID: RBC_PFK                            │
│ SIMULATABLE: True                            │
│ PARAMETERS NUMERICALY CONSISTENT: False      │
╞══════════════════════════════════════════════╡
│ ============================================ │
│             CONSISTENCY CHECKS               │
│ ============================================ │
│ Elemental                                    │
│ ---------------------------------            │
│ HEX1: {H: -3.0; O: -4.0; P: -1.0}            │
│ PGI: {H: 3.0; O: 4.0; P: 1.0}                │
│ G6PDH2r: {H: 3.0; O: 4.0; P: 1.0}            │
│ DM_nadh: {charge: 2.0}                       │
│ GSHR: {charge: 2.0}                          │
│ ============================================ │
╘══════════════════════════════════════════════╛

As seen above, there are reactions other than the two expected pseudoreactions that appear in the printed report. Specifically, these are reactions with an imbalance in phosphoric acid (H3PO4). To get the imbalanced reaction objects, use the check_elemental_consistency() function.

[16]:
imbalanced_reactions = qcqa.check_elemental_consistency(
    model, reaction_list=["HEX1", "PGI", "G6PDH2r"])
imbalanced_reactions
[16]:
{<MassReaction HEX1 at 0x7fdc709bba30>: 'H: -3.0; O: -4.0; P: -1.0',
 <MassReaction PGI at 0x7fdc709bba60>: 'H: 3.0; O: 4.0; P: 1.0',
 <MassReaction G6PDH2r at 0x7fdc709c8a00>: 'H: 3.0; O: 4.0; P: 1.0'}

By looking at the reactions, their stoichiometries, and the unbalanced elements, it is clear that glucose 6-phosphate (G6P) is missing a phosphoric acid in its chemica formula.

[17]:
for reaction, unbalanced in imbalanced_reactions.items():
    print(reaction)

g6p_c = model.metabolites.get_by_id("g6p_c")
print("\n{0} formula before: {1}".format(g6p_c.id, repr(g6p_c.formula)))
HEX1: atp_c + glc__D_c <=> adp_c + g6p_c + h_c
PGI: g6p_c <=> f6p_c
G6PDH2r: g6p_c + nadp_c <=> _6pgl_c + h_c + nadph_c

g6p_c formula before: 'C6H8O5'

The current elemental composition of G6P is combined with the elemental composition of phosphoric acid:

[18]:
# Get existing formula composition
formula_composition = g6p_c.elements

# Update with the phosphoric acid
phosphoric_acid = {"H": 3, "P": 1, "O": 4}
for element, to_add in phosphoric_acid.items():
    if element in formula_composition:
        formula_composition[element] += to_add
    else:
        formula_composition[element] = to_add

# Change the existing formula to the new one
g6p_c.elements = formula_composition

print("{0} formula after: {1}".format(g6p_c.id, repr(g6p_c.formula)))
g6p_c formula after: 'C6H11O9P'

The reactions are no longer considered imbalanced.

[19]:
imbalanced_reactions = qcqa.check_elemental_consistency(
    model, reaction_list=["HEX1", "PGI", "G6PDH2r"])
imbalanced_reactions
[19]:
{}

10.3.2. Superfluous parameters

To identify the reactions with superfluous parameters, the superfluous kwarg is set as True. If a reaction has superfluous parameters, the parameters are checked to ensure that they are numerically consistent:

[20]:
qcqa.qcqa_model(model, superfluous=True)
╒══════════════════════════════════════════════╕
│ MODEL ID: RBC_PFK                            │
│ SIMULATABLE: True                            │
│ PARAMETERS NUMERICALY CONSISTENT: False      │
╞══════════════════════════════════════════════╡
│ ============================================ │
│             CONSISTENCY CHECKS               │
│ ============================================ │
│ Superfluous Parameters                       │
│ ------------------------                     │
│ HEX1: Inconsistent                           │
│ PYK: Consistent                              │
│ ============================================ │
╘══════════════════════════════════════════════╛

The pyruvate kinase reaction (PYK) contains a consistent superfluous parameter. A consistent superfluous parameter indicates that although an extra parameter is defined, the forward rate constant, reverse rate constant, and the equilibrium constant are numerically consistent with consistency being determined as \(|k_{f} / K_{eq} - k_{r}| \le tolerance\). The tolerance is determined by the decimal_precision of the MassConfiguration object (e.g., a decimal_precision of eight corresponds to rounding at the 8th digit right of the decimal, equivalent to \(|k_{f} / K_{eq} - k_{r}| \le 10^{-8}\).

[21]:
PYK = model.reactions.get_by_id("PYK")
print(abs(PYK.kf / PYK.Keq - PYK.kr))
0.0

The hexokinase reaction (HEX1) contains an inconsistent superfluous parameter:

[22]:
HEX1 = model.reactions.get_by_id("HEX1")
print(abs(HEX1.kf / HEX1.Keq - HEX1.kr))
10.0

Inconsistent superfluous parameters are quickly fixed by defining them as a consistent value, or ignored by setting the value as None.

[23]:
HEX1.kr = None
qcqa.qcqa_model(model, superfluous=True)
╒══════════════════════════════════════════════╕
│ MODEL ID: RBC_PFK                            │
│ SIMULATABLE: True                            │
│ PARAMETERS NUMERICALY CONSISTENT: True       │
╞══════════════════════════════════════════════╡
│ ============================================ │
│             CONSISTENCY CHECKS               │
│ ============================================ │
│ Superfluous Parameters                       │
│ ------------------------                     │
│ PYK: Consistent                              │
│ ============================================ │
╘══════════════════════════════════════════════╛

After addressing several of the model issues, the qcqa_model() function at the beginning of this notebook can be reused. This time, the report indicates that the model is elementally balanced and contains the numerical values necessary for simulation.

[24]:
qcqa.qcqa_model(
    model,
    parameters=True,        # Check for undefined but necessary parameters in the model
    concentrations=True,    # Check for undefined but necessary concentrations in the model
    fluxes=True,            # Check for undefined steady state fluxes for reactions in the model
    superfluous=True,       # Check for excess parameters and ensure they are consistent.
    elemental=True,         # Check mass and charge balancing of reactions in the model
)
╒════════════════════════════════════════════════════╕
│ MODEL ID: RBC_PFK                                  │
│ SIMULATABLE: True                                  │
│ PARAMETERS NUMERICALY CONSISTENT: True             │
╞════════════════════════════════════════════════════╡
│ ================================================== │
│                CONSISTENCY CHECKS                  │
│ ================================================== │
│ Superfluous Parameters    Elemental                │
│ ------------------------  ----------------------   │
│ PYK: Consistent           DM_nadh: {charge: 2.0}   │
│                           GSHR: {charge: 2.0}      │
│ ================================================== │
╘════════════════════════════════════════════════════╛