1. Getting Started with MASSpy

In this notebook example, objects essential to MASSpy are explored.

1.1. Models

In MASSpy, a model is represented by a mass.MassModel object. MASSpy comes bundled with example models, including a “textbook” model\(^1\) of human red blood cell metabolism. To load this example model:

[1]:
from operator import attrgetter

import mass
import mass.example_data

model = mass.example_data.create_example_model("textbook")
Set parameter Username

Several attributes of the MassModel, including model reactions and metabolites, are special types of lists that contain objects related to the model. Each specialized list is called a cobra.DictList and is made up of the corresponding objects. For example, the reactions attribute contains the mass.MassReaction objects, and the metabolites attribute contains the mass.MassMetabolite objects.

[2]:
print("Number of metabolites: " + str(len(model.metabolites)))
print("Number of reactions: " + str(len(model.reactions)))
Number of metabolites: 68
Number of reactions: 76

When using a Jupyter notebook, this type of information is rendered as a table.

[3]:
model
[3]:
NameRBC_PFK
Memory address0x07fe87832e850
Stoichiometric Matrix 68x76
Matrix Rank 63
Number of metabolites 68
Initial conditions defined 68/68
Number of reactions 76
Number of genes 0
Number of enzyme modules 1
Number of groups 16
Objective expression 0
Compartments Cytosol

Just like a regular list, objects in a cobra.DictList can be retrieved by index. For example, to get the 30th reaction in the model (at index 29 because of 0-indexing):

[4]:
model.reactions[29]
[4]:
Reaction identifier DPGase
Name Diphosphoglycerate phosphatase
Memory address 0x07fe8c9882910
Subsystem Hemoglobin
Kinetic Reversibility False
Stoichiometry

_23dpg_c + h2o_c --> _3pg_c + pi_c

2,3-Disphospho-D-glycerate + H2O --> 3-Phospho-D-glycerate + Phosphate

GPR
Bounds(-1000.0, 1000.0)

Items also can be retrieved by their id attribute using the cobra.DictList.get_by_id() method. For example, to get the cytosolic atp metabolite object with identifier “atp_c”:

[5]:
model.metabolites.get_by_id("atp_c")
[5]:
MassMetabolite identifier atp_c
Name ATP
Memory address 0x07fe8c9834430
Formula C10H12N5O13P3
Compartment c
Initial Condition 1.2338626826140733
In 16 reaction(s) PRPPS, PFK_R31, PFK_R11, ATPM, PFK_R21, PFK_T2, PFK_T4, PFK_T1, ADNK1, ADK1, PFK_R41, PYK, PFK_R01, HEX1, PGK, PFK_T3

If care is taken when assigning object identifiers (e.g., does not start with a number, does not contain certain characters such as “-“), it is possible to access objects inside of a cobra.DictList as if they were attributes.

[6]:
print(model.reactions.DPGase)
DPGase: _23dpg_c + h2o_c --> _3pg_c + pi_c

To ensure all identifiers comply with Systems Biology Markup Language (SBML) and allow for interactive use, utilizing the identifiers from the BiGG Models database is highly recommended.

Guidelines for BiGG identifiers are found here.

1.2. Reactions

In MASSpy, a reaction is represented by a mass.MassReaction object. A particular reaction can be retrieved by its id using the cobra.DictList.get_by_id() method. Below, the reaction with identifier “PGI” is inspected.

[7]:
PGI = model.reactions.get_by_id("PGI")
PGI
[7]:
Reaction identifier PGI
Name Glucose-6-phosphate isomerase
Memory address 0x07fe8c9841760
Subsystem Glycolysis
Kinetic Reversibility True
Stoichiometry

g6p_c <=> f6p_c

D-Glucose 6-phosphate <=> D-Fructose 6-phosphate

GPR
Bounds(-1000.0, 1000.0)

The full name and the chemical reaction are viewed as strings. If defined, the flux value for the reaction at steady state also can be viewed.

[8]:
print(PGI.name)
print(PGI.reaction)
print(PGI.steady_state_flux)
Glucose-6-phosphate isomerase
g6p_c <=> f6p_c
0.9098871145647632

The symbolic rate equation for the reaction is viewed using the rate attribute. The rate is returned as a SymPy symbolic expression.

[9]:
print(PGI.rate)
kf_PGI*(g6p_c(t) - f6p_c(t)/Keq_PGI)

The above rate is considered a Type 1 rate law because of the reaction parameters used. There are three types of mass action rate laws that can be generated; * Type 1 rates utilize the forward rate and equilibrium constants. * Type 2 rates utilize the forward rate and reverse rate constants. * Type 3 rates utilize the equilibrium and reverse rate constants.

To view the reaction rate as a Type 2 rate equation, the get_mass_action_rate() method can be used.

[10]:
print(PGI.get_mass_action_rate(rate_type=2))
kf_PGI*g6p_c(t) - kr_PGI*f6p_c(t)

Because the reaction has its reversible attribute set as True, the net mass action rate contains both forward and reverse rate components.

[11]:
print("Reversible: {0!r}".format(PGI.reversible))
print("Forward rate: {0!r}".format(
    PGI.get_forward_mass_action_rate_expression(rate_type=2)))
print("Reverse rate: {0!r}".format(
    PGI.get_reverse_mass_action_rate_expression(rate_type=2)))
print("Net reaction rate: {0!r}".format(
    PGI.get_mass_action_rate(rate_type=2)))
Reversible: True
Forward rate: kf_PGI*g6p_c(t)
Reverse rate: kr_PGI*f6p_c(t)
Net reaction rate: kf_PGI*g6p_c(t) - kr_PGI*f6p_c(t)

To view the defined parameters for the rate and equilibrium constants, the parameters attribute is used to return a dict containing the parameter identifiers and their values.

[12]:
PGI.parameters
[12]:
{'kf_PGI': 2961.1111111111486, 'Keq_PGI': 0.41}

Parameter identifiers for a reaction can be obtained using various attributes.

[13]:
print(PGI.flux_symbol_str)
print(PGI.kf_str)
print(PGI.Keq_str)
print(PGI.kr_str)
v_PGI
kf_PGI
Keq_PGI
kr_PGI

Changing the reversible attribute of the reaction affects the net rate equation:

[14]:
PGI.reversible = False
print("Reversible: {0!r}".format(PGI.reversible))
print("Net reaction rate: {0!r}".format(PGI.rate))
Reversible: False
Net reaction rate: kf_PGI*g6p_c(t)

Note that changing the reversible attribute of the reaction can affect the parameters:

[15]:
PGI.parameters
[15]:
{'kf_PGI': 2961.1111111111486, 'Keq_PGI': inf}

The reaction can be checked for whether it is mass balanced using the check_mass_balance() method. This method returns the elements that violate mass balance. If it comes back empty, then the reaction is mass balanced.

[16]:
PGI.check_mass_balance()
[16]:
{}

The add_metabolites() method can be used to add metabolites to a reaction by passing in a dict that contains the MassMetabolite objects and their coefficients:

[17]:
PGI.add_metabolites({model.metabolites.get_by_id("h_c"): -1})
print(PGI)
PGI: g6p_c + h_c --> f6p_c

The reaction is no longer mass balanced.

[18]:
PGI.subtract_metabolites({model.metabolites.get_by_id("h_c"): -1})
print(PGI)
print(PGI.check_mass_balance())
PGI: g6p_c --> f6p_c
{}

1.3. Metabolites

In MASSpy, a metabolite is represented by a MassMetabolite object. A particular metabolite can be retrieved by its id using the cobra.DictList.get_by_id() method. Below, the cytosolic glucose 6-phosphate metabolite with identifier “g6p_c” is inspected.

[19]:
g6p_c = model.metabolites.get_by_id("g6p_c")
g6p_c
[19]:
MassMetabolite identifier g6p_c
Name D-Glucose 6-phosphate
Memory address 0x07fe8c97f8ca0
Formula C6H11O9P
Compartment c
Initial Condition 0.16501847288094948
In 3 reaction(s) G6PDH2r, HEX1, PGI

The full name and the compartment where the metabolite is located (“c” for cytosol) are viewed as strings:

[20]:
print(g6p_c.name)
print(g6p_c.compartment)
D-Glucose 6-phosphate
c

The chemical formula and associated charge of the metabolite also can be viewed:

[21]:
print(g6p_c.formula)
print(g6p_c.charge)
C6H11O9P
-2

Reactions in which the metabolite participates are obtained as a frozenset from the reactions attribute. This can be used to count the number of reactions that utilize the metabolite.

[22]:
print("Number of reactions involving {0}: {1}".format(
    g6p_c.id, len(g6p_c.reactions)))
Number of reactions involving g6p_c: 3

The ordinary differential equation (ODE), which represents the change in metabolite concentration over time, is determined by reactions that consume or produce the metabolite. The oridinary_differential_equation attribute is used to view the current ODE for the metabolite. The ODE is returned as a symbolic expression.

[23]:
print(g6p_c.ordinary_differential_equation)
-kf_G6PDH2r*(g6p_c(t)*nadp_c(t) - _6pgl_c(t)*nadph_c(t)/Keq_G6PDH2r) + kf_HEX1*(atp_c(t)*glc__D_c(t) - adp_c(t)*g6p_c(t)/Keq_HEX1) - kf_PGI*g6p_c(t)

Numerical solutions are obtained by integrating ODEs. To integrate an ODE, a metabolite concentration at time \(t = 0\) must be defined as an initial condition. Initial conditions are accessed and changed using the initial_condition attribute.

[24]:
g6p_c.initial_condition = 0.8
print(g6p_c.initial_condition)
0.8

Certain attributes have alias attribute accessors. For example, the initial_condition and ordinary_differential_equation attributes can be accessed via ic and ode, respectively.

[25]:
print(g6p_c.ic)
0.8

The fixed attribute indicates whether the concentration of the metabolite is allowed to vary over time, with True meaning that the metabolite’s initial condition is treated as a constant concentration value. Fixed metabolites have an ODE equal to 0.

[26]:
g6p_c.fixed = True
print(g6p_c.ode)
0

1.4. Additional Model Objects

The following are additional objects that are stored within a MassModel. Unlike metabolites and reactions, which are essential for defining the system of ODEs, these objects are not always necessary for dynamic simulation of models.

However, these objects are still important to MASSpy and have a variety of uses, which include: aiding in the management of large models, sharing models among users, tracking additional information relevant to the system, and enabling various workflows for genome-scale kinetic models.

1.4.1. Genes

Because the mass.MassReaction inherits from the cobra.Reaction, MASSpy is also capable of handling genes and gene-protein-reaction (GPR) relationships. Note that MASSpy directly utilizes the cobra.Gene object for the representation and management of genes.

The gene_reaction_rule is a Boolean representation of the gene requirements for this reaction to be active [SQF+11].

GPRs are stored as the gene_reaction_rule of reaction objects. Altering a gene_reaction_rule will create new gene objects, if necessary, and update all relationships.

[27]:
PGI.gene_reaction_rule = "New_Gene"
PGI.gene_reaction_rule
[27]:
'New_Gene'

Gene objects are returned from a reaction as a frozenset:

[28]:
PGI.genes
[28]:
frozenset({<Gene New_Gene at 0x7fe8b99e9070>})

Newly created genes are added to the model upon creation. The gene objects are stored in a cobra.DictList as a part of the genes attribute. To access a gene from the model:

[29]:
new_gene = model.genes.get_by_id("New_Gene")
new_gene
[29]:
Gene identifierNew_Gene
Name
Memory address 0x07fe8b99e9070
FunctionalTrue
In 1 reaction(s) PGI

Gene objects are tracked by both its associated reaction objects and the the model. Changing a reaction’s gene_reaction_rule may remove the gene’s association from the reaction, but it does not remove the gene from the model.

[30]:
PGI.gene_reaction_rule = ""
print("Reaction Genes: {0!r}".format(PGI.genes))
new_gene
Reaction Genes: frozenset()
[30]:
Gene identifierNew_Gene
Name
Memory address 0x07fe8b99e9070
FunctionalTrue
In 0 reaction(s)

1.4.2. EnzymeModules

A mass.EnzymeModule is a specialized MassModel that represents a reconstruction of an enzyme’s mechanism. Upon merging an EnzymeModule into a MassModel, the EnzymeModule is converted into a mass.EnzymeModuleDict, a specialized dictionary object. The EnzymeModuleDict is subsequently stored in a cobra.DictList and is accessible through the enzyme_modules attribute.

[31]:
PFK = model.enzyme_modules.get_by_id("PFK")
PFK
[31]:
NamePFK
Memory address0x07fe8a93ff940
Stoichiometric Matrix 26x24
Matrix Rank 20
Subsystem Glycolysis
Number of Ligands 6
Number of EnzymeForms 20
Number of EnzymeModuleReactions 24
Enzyme Concentration Total 3.3e-05
Enzyme Net Flux 1.12

The process of creating an EnzymeModuleDict preserves information stored in various attributes specific to the enzyme module and allows them to be accessed quickly after the merging process. Because the EnzymeModuleDict inherits from an OrderedDict, it has the same methods and behaviors.

[32]:
for key, value in PFK.items():
    if isinstance(value, list):
        print("{0}: {1}".format(key, len(value)))
enzyme_module_ligands: 6
enzyme_module_forms: 20
enzyme_module_reactions: 24
enzyme_module_ligands_categorized: 5
enzyme_module_forms_categorized: 5
enzyme_module_reactions_categorized: 6

EnzymeModuleDict objects also can have their contents accessed by using its dict keys as attribute accessors:

[33]:
print(PFK.id)
print(PFK.subsystem)
print(PFK.enzyme_concentration_total)
PFK
Glycolysis
3.3e-05

See the section on EnzymeModules for more information on working with EnzymeModule and related objects.

1.4.3. Groups

Groups are objects for holding information regarding pathways, subsystems, or any custom grouping of objects within a MassModel. MASSpy directly utilizes the cobra.Group object, which are implemented based on the SBML Group specifications.

Group objects are stored in a cobra.DictList as the groups attribute of the MassModel.

[34]:
print("Number of groups: {0}".format(len(model.groups)))
Number of groups: 16

There are several different ways to work with group objects. One way that MASSpy utilizes group objects is to aid with categorizing and grouping various objects associated with EnzymeModule objects.

For example, the group Products contains all of metabolites that are the products of the reaction catalyzed by the PFK enzyme. A set containing the associated metabolite objects is returned by the members attribute.

[35]:
products = model.groups.get_by_id("Products")
products.members
[35]:
[<MassMetabolite h_c at 0x7fe8c9834550>,
 <MassMetabolite fdp_c at 0x7fe8c97f8d90>,
 <MassMetabolite adp_c at 0x7fe8c9834400>]

Groups are also used to categorize reactions. For example, the group atp_c_binding contains all of the reactions that represent the binding of ATP to the free active sites of the PFK enzyme.

[36]:
complexed_w_atp = model.groups.get_by_id("atp_c_binding")
for member in complexed_w_atp.members:
    print(member)
PFK_R41: atp_c + pfk_R4_c <=> pfk_R4_A_c
PFK_R01: atp_c + pfk_R0_c <=> pfk_R0_A_c
PFK_R11: atp_c + pfk_R1_c <=> pfk_R1_A_c
PFK_R31: atp_c + pfk_R3_c <=> pfk_R3_A_c
PFK_R21: atp_c + pfk_R2_c <=> pfk_R2_A_c

Because groups are sets, members of groups are returned in no particular order. To maintain a consistent order, the sorted() function is used with the attrgetter() function from the operator module to sort members by a particular object attribute.

[37]:
for member in sorted(complexed_w_atp.members, key=attrgetter("id")):
    print(member)
PFK_R01: atp_c + pfk_R0_c <=> pfk_R0_A_c
PFK_R11: atp_c + pfk_R1_c <=> pfk_R1_A_c
PFK_R21: atp_c + pfk_R2_c <=> pfk_R2_A_c
PFK_R31: atp_c + pfk_R3_c <=> pfk_R3_A_c
PFK_R41: atp_c + pfk_R4_c <=> pfk_R4_A_c

1.4.4. Units

Unit and UnitDefinition objects are implemented, as per the SBML Unit and SBML UnitDefinition specifications. The primary purpose of these objects is to inform users of the model’s units, providing context to model values and observed results.

It is important to note that unit consistency is NOT checked by the MassModel, meaning that it is incumbent upon users to maintain consistency of units and associated numerical values in a model.

UnitDefinition objects are stored in a cobra.DictList, accessible through the units attribute.

[38]:
model.units
[38]:
[<UnitDefinition Millimolar "mM" at 0x7fe8c98a3eb0>,
 <UnitDefinition hour "hr" at 0x7fe8c97f8c70>]

UnitDefinition objects have identifiers and optional names. Therefore, a specific unit can be accessed using the get_by_id() method.

[39]:
concentration_unit = model.units.get_by_id("mM")
print(concentration_unit.id)
print(concentration_unit.name)
mM
Millimolar

A UnitDefinition is comprised of base units, which are stored in the list_of_units attribute. Each unit must have a defined kind, exponent, scale, and multiplier.

[40]:
for unit in concentration_unit.list_of_units:
    print(unit)
kind: litre; exponent: -1; scale: 0; multiplier: 1
kind: mole; exponent: 1; scale: -3; multiplier: 1

MASSpy contains some commonly defined Unit objects to aid in the creation of UnitDefinition objects, which are viewed using the print_defined_unit_values() function.

[41]:
mass.core.units.print_defined_unit_values("Units")
╒════════════════════════════════════════════════════════════════════╕
│ Pre-defined Units                                                  │
╞════════════════════════════════════════════════════════════════════╡
│ Unit        Definition                                             │
│ ----------  ------------------------------------------------------ │
│ mole        kind: mole; exponent: 1; scale: 0; multiplier: 1       │
│ millimole   kind: mole; exponent: 1; scale: -3; multiplier: 1      │
│ litre       kind: litre; exponent: 1; scale: 0; multiplier: 1      │
│ per_litre   kind: litre; exponent: -1; scale: 0; multiplier: 1     │
│ second      kind: second; exponent: 1; scale: 0; multiplier: 1     │
│ per_second  kind: second; exponent: -1; scale: 0; multiplier: 1    │
│ hour        kind: second; exponent: 1; scale: 0; multiplier: 3600  │
│ per_hour    kind: second; exponent: -1; scale: 0; multiplier: 3600 │
│ per_gDW     kind: gram; exponent: -1; scale: 0; multiplier: 1      │
╘════════════════════════════════════════════════════════════════════╛

\(^1\) The “textbook” model is created from Chapters 10-14 of [Pal11]