1. Getting Started with MASSpy
In this notebook example, objects essential to MASSpy are explored.
1.1. Models
In MASSpy, a model is represented by a mass.MassModel
object. MASSpy comes bundled with example models, including a “textbook” model\(^1\) of human red blood cell metabolism. To load this example model:
[1]:
from operator import attrgetter
import mass
import mass.example_data
model = mass.example_data.create_example_model("textbook")
Set parameter Username
Several attributes of the MassModel
, including model reactions and metabolites, are special types of lists that contain objects related to the model. Each specialized list is called a cobra.DictList
and is made up of the corresponding objects. For example, the reactions
attribute contains the mass.MassReaction
objects, and the metabolites
attribute contains the mass.MassMetabolite
objects.
[2]:
print("Number of metabolites: " + str(len(model.metabolites)))
print("Number of reactions: " + str(len(model.reactions)))
Number of metabolites: 68
Number of reactions: 76
When using a Jupyter notebook, this type of information is rendered as a table.
[3]:
model
[3]:
Name | RBC_PFK |
Memory address | 0x07fe87832e850 |
Stoichiometric Matrix | 68x76 |
Matrix Rank | 63 |
Number of metabolites | 68 |
Initial conditions defined | 68/68 |
Number of reactions | 76 |
Number of genes | 0 |
Number of enzyme modules | 1 |
Number of groups | 16 |
Objective expression | 0 |
Compartments | Cytosol |
Just like a regular list, objects in a cobra.DictList
can be retrieved by index. For example, to get the 30th reaction in the model (at index 29 because of 0-indexing):
[4]:
model.reactions[29]
[4]:
Reaction identifier | DPGase |
Name | Diphosphoglycerate phosphatase |
Memory address | 0x07fe8c9882910 |
Subsystem | Hemoglobin |
Kinetic Reversibility | False |
Stoichiometry |
_23dpg_c + h2o_c --> _3pg_c + pi_c 2,3-Disphospho-D-glycerate + H2O --> 3-Phospho-D-glycerate + Phosphate |
GPR | |
Bounds | (-1000.0, 1000.0) |
Items also can be retrieved by their id
attribute using the cobra.DictList.get_by_id()
method. For example, to get the cytosolic atp metabolite object with identifier “atp_c”:
[5]:
model.metabolites.get_by_id("atp_c")
[5]:
MassMetabolite identifier | atp_c |
Name | ATP |
Memory address | 0x07fe8c9834430 |
Formula | C10H12N5O13P3 |
Compartment | c |
Initial Condition | 1.2338626826140733 |
In 16 reaction(s) | PRPPS, PFK_R31, PFK_R11, ATPM, PFK_R21, PFK_T2, PFK_T4, PFK_T1, ADNK1, ADK1, PFK_R41, PYK, PFK_R01, HEX1, PGK, PFK_T3 |
If care is taken when assigning object identifiers (e.g., does not start with a number, does not contain certain characters such as “-“), it is possible to access objects inside of a cobra.DictList
as if they were attributes.
[6]:
print(model.reactions.DPGase)
DPGase: _23dpg_c + h2o_c --> _3pg_c + pi_c
To ensure all identifiers comply with Systems Biology Markup Language (SBML) and allow for interactive use, utilizing the identifiers from the BiGG Models database is highly recommended.
Guidelines for BiGG identifiers are found here.
1.2. Reactions
In MASSpy, a reaction is represented by a mass.MassReaction
object. A particular reaction can be retrieved by its id
using the cobra.DictList.get_by_id()
method. Below, the reaction with identifier “PGI” is inspected.
[7]:
PGI = model.reactions.get_by_id("PGI")
PGI
[7]:
Reaction identifier | PGI |
Name | Glucose-6-phosphate isomerase |
Memory address | 0x07fe8c9841760 |
Subsystem | Glycolysis |
Kinetic Reversibility | True |
Stoichiometry |
g6p_c <=> f6p_c D-Glucose 6-phosphate <=> D-Fructose 6-phosphate |
GPR | |
Bounds | (-1000.0, 1000.0) |
The full name and the chemical reaction are viewed as strings. If defined, the flux value for the reaction at steady state also can be viewed.
[8]:
print(PGI.name)
print(PGI.reaction)
print(PGI.steady_state_flux)
Glucose-6-phosphate isomerase
g6p_c <=> f6p_c
0.9098871145647632
The symbolic rate equation for the reaction is viewed using the rate
attribute. The rate is returned as a SymPy symbolic expression.
[9]:
print(PGI.rate)
kf_PGI*(g6p_c(t) - f6p_c(t)/Keq_PGI)
The above rate is considered a Type 1 rate law because of the reaction parameters used. There are three types of mass action rate laws that can be generated; * Type 1 rates utilize the forward rate and equilibrium constants. * Type 2 rates utilize the forward rate and reverse rate constants. * Type 3 rates utilize the equilibrium and reverse rate constants.
To view the reaction rate as a Type 2 rate equation, the get_mass_action_rate()
method can be used.
[10]:
print(PGI.get_mass_action_rate(rate_type=2))
kf_PGI*g6p_c(t) - kr_PGI*f6p_c(t)
Because the reaction has its reversible
attribute set as True
, the net mass action rate contains both forward and reverse rate components.
[11]:
print("Reversible: {0!r}".format(PGI.reversible))
print("Forward rate: {0!r}".format(
PGI.get_forward_mass_action_rate_expression(rate_type=2)))
print("Reverse rate: {0!r}".format(
PGI.get_reverse_mass_action_rate_expression(rate_type=2)))
print("Net reaction rate: {0!r}".format(
PGI.get_mass_action_rate(rate_type=2)))
Reversible: True
Forward rate: kf_PGI*g6p_c(t)
Reverse rate: kr_PGI*f6p_c(t)
Net reaction rate: kf_PGI*g6p_c(t) - kr_PGI*f6p_c(t)
To view the defined parameters for the rate and equilibrium constants, the parameters
attribute is used to return a dict
containing the parameter identifiers and their values.
[12]:
PGI.parameters
[12]:
{'kf_PGI': 2961.1111111111486, 'Keq_PGI': 0.41}
Parameter identifiers for a reaction can be obtained using various attributes.
[13]:
print(PGI.flux_symbol_str)
print(PGI.kf_str)
print(PGI.Keq_str)
print(PGI.kr_str)
v_PGI
kf_PGI
Keq_PGI
kr_PGI
Changing the reversible
attribute of the reaction affects the net rate equation:
[14]:
PGI.reversible = False
print("Reversible: {0!r}".format(PGI.reversible))
print("Net reaction rate: {0!r}".format(PGI.rate))
Reversible: False
Net reaction rate: kf_PGI*g6p_c(t)
Note that changing the reversible
attribute of the reaction can affect the parameters:
[15]:
PGI.parameters
[15]:
{'kf_PGI': 2961.1111111111486, 'Keq_PGI': inf}
The reaction can be checked for whether it is mass balanced using the check_mass_balance()
method. This method returns the elements that violate mass balance. If it comes back empty, then the reaction is mass balanced.
[16]:
PGI.check_mass_balance()
[16]:
{}
The add_metabolites()
method can be used to add metabolites to a reaction by passing in a dict
that contains the MassMetabolite
objects and their coefficients:
[17]:
PGI.add_metabolites({model.metabolites.get_by_id("h_c"): -1})
print(PGI)
PGI: g6p_c + h_c --> f6p_c
The reaction is no longer mass balanced.
[18]:
PGI.subtract_metabolites({model.metabolites.get_by_id("h_c"): -1})
print(PGI)
print(PGI.check_mass_balance())
PGI: g6p_c --> f6p_c
{}
1.3. Metabolites
In MASSpy, a metabolite is represented by a MassMetabolite
object. A particular metabolite can be retrieved by its id
using the cobra.DictList.get_by_id()
method. Below, the cytosolic glucose 6-phosphate metabolite with identifier “g6p_c” is inspected.
[19]:
g6p_c = model.metabolites.get_by_id("g6p_c")
g6p_c
[19]:
MassMetabolite identifier | g6p_c |
Name | D-Glucose 6-phosphate |
Memory address | 0x07fe8c97f8ca0 |
Formula | C6H11O9P |
Compartment | c |
Initial Condition | 0.16501847288094948 |
In 3 reaction(s) | G6PDH2r, HEX1, PGI |
The full name and the compartment where the metabolite is located (“c” for cytosol) are viewed as strings:
[20]:
print(g6p_c.name)
print(g6p_c.compartment)
D-Glucose 6-phosphate
c
The chemical formula and associated charge of the metabolite also can be viewed:
[21]:
print(g6p_c.formula)
print(g6p_c.charge)
C6H11O9P
-2
Reactions in which the metabolite participates are obtained as a frozenset
from the reactions
attribute. This can be used to count the number of reactions that utilize the metabolite.
[22]:
print("Number of reactions involving {0}: {1}".format(
g6p_c.id, len(g6p_c.reactions)))
Number of reactions involving g6p_c: 3
The ordinary differential equation (ODE), which represents the change in metabolite concentration over time, is determined by reactions that consume or produce the metabolite. The oridinary_differential_equation
attribute is used to view the current ODE for the metabolite. The ODE is returned as a symbolic expression.
[23]:
print(g6p_c.ordinary_differential_equation)
-kf_G6PDH2r*(g6p_c(t)*nadp_c(t) - _6pgl_c(t)*nadph_c(t)/Keq_G6PDH2r) + kf_HEX1*(atp_c(t)*glc__D_c(t) - adp_c(t)*g6p_c(t)/Keq_HEX1) - kf_PGI*g6p_c(t)
Numerical solutions are obtained by integrating ODEs. To integrate an ODE, a metabolite concentration at time \(t = 0\) must be defined as an initial condition. Initial conditions are accessed and changed using the initial_condition
attribute.
[24]:
g6p_c.initial_condition = 0.8
print(g6p_c.initial_condition)
0.8
Certain attributes have alias attribute accessors. For example, the initial_condition
and ordinary_differential_equation
attributes can be accessed via ic
and ode
, respectively.
[25]:
print(g6p_c.ic)
0.8
The fixed
attribute indicates whether the concentration of the metabolite is allowed to vary over time, with True
meaning that the metabolite’s initial condition is treated as a constant concentration value. Fixed metabolites have an ODE equal to 0.
[26]:
g6p_c.fixed = True
print(g6p_c.ode)
0
1.4. Additional Model Objects
The following are additional objects that are stored within a MassModel
. Unlike metabolites and reactions, which are essential for defining the system of ODEs, these objects are not always necessary for dynamic simulation of models.
However, these objects are still important to MASSpy and have a variety of uses, which include: aiding in the management of large models, sharing models among users, tracking additional information relevant to the system, and enabling various workflows for genome-scale kinetic models.
1.4.1. Genes
Because the mass.MassReaction
inherits from the cobra.Reaction
, MASSpy is also capable of handling genes and gene-protein-reaction (GPR) relationships. Note that MASSpy directly utilizes the cobra.Gene
object for the representation and management of genes.
The gene_reaction_rule
is a Boolean representation of the gene requirements for this reaction to be active [SQF+11].
GPRs are stored as the gene_reaction_rule
of reaction objects. Altering a gene_reaction_rule
will create new gene objects, if necessary, and update all relationships.
[27]:
PGI.gene_reaction_rule = "New_Gene"
PGI.gene_reaction_rule
[27]:
'New_Gene'
Gene
objects are returned from a reaction as a frozenset
:
[28]:
PGI.genes
[28]:
frozenset({<Gene New_Gene at 0x7fe8b99e9070>})
Newly created genes are added to the model upon creation. The gene objects are stored in a cobra.DictList
as a part of the genes
attribute. To access a gene from the model:
[29]:
new_gene = model.genes.get_by_id("New_Gene")
new_gene
[29]:
Gene identifier | New_Gene |
Name | |
Memory address | 0x07fe8b99e9070 |
Functional | True |
In 1 reaction(s) | PGI |
Gene objects are tracked by both its associated reaction objects and the the model. Changing a reaction’s gene_reaction_rule
may remove the gene’s association from the reaction, but it does not remove the gene from the model.
[30]:
PGI.gene_reaction_rule = ""
print("Reaction Genes: {0!r}".format(PGI.genes))
new_gene
Reaction Genes: frozenset()
[30]:
Gene identifier | New_Gene |
Name | |
Memory address | 0x07fe8b99e9070 |
Functional | True |
In 0 reaction(s) |
1.4.2. EnzymeModules
A mass.EnzymeModule
is a specialized MassModel
that represents a reconstruction of an enzyme’s mechanism. Upon merging an EnzymeModule
into a MassModel
, the EnzymeModule
is converted into a mass.EnzymeModuleDict
, a specialized dictionary object. The EnzymeModuleDict
is subsequently stored in a cobra.DictList
and is accessible through the enzyme_modules
attribute.
[31]:
PFK = model.enzyme_modules.get_by_id("PFK")
PFK
[31]:
Name | PFK |
Memory address | 0x07fe8a93ff940 |
Stoichiometric Matrix | 26x24 |
Matrix Rank | 20 |
Subsystem | Glycolysis |
Number of Ligands | 6 |
Number of EnzymeForms | 20 |
Number of EnzymeModuleReactions | 24 |
Enzyme Concentration Total | 3.3e-05 |
Enzyme Net Flux | 1.12 |
The process of creating an EnzymeModuleDict
preserves information stored in various attributes specific to the enzyme module and allows them to be accessed quickly after the merging process. Because the EnzymeModuleDict
inherits from an OrderedDict
, it has the same methods and behaviors.
[32]:
for key, value in PFK.items():
if isinstance(value, list):
print("{0}: {1}".format(key, len(value)))
enzyme_module_ligands: 6
enzyme_module_forms: 20
enzyme_module_reactions: 24
enzyme_module_ligands_categorized: 5
enzyme_module_forms_categorized: 5
enzyme_module_reactions_categorized: 6
EnzymeModuleDict
objects also can have their contents accessed by using its dict
keys as attribute accessors:
[33]:
print(PFK.id)
print(PFK.subsystem)
print(PFK.enzyme_concentration_total)
PFK
Glycolysis
3.3e-05
See the section on EnzymeModules for more information on working with EnzymeModule
and related objects.
1.4.3. Groups
Groups are objects for holding information regarding pathways, subsystems, or any custom grouping of objects within a MassModel
. MASSpy directly utilizes the cobra.Group
object, which are implemented based on the SBML Group specifications.
Group
objects are stored in a cobra.DictList
as the groups
attribute of the MassModel
.
[34]:
print("Number of groups: {0}".format(len(model.groups)))
Number of groups: 16
There are several different ways to work with group objects. One way that MASSpy utilizes group objects is to aid with categorizing and grouping various objects associated with EnzymeModule
objects.
For example, the group Products
contains all of metabolites that are the products of the reaction catalyzed by the PFK
enzyme. A set
containing the associated metabolite objects is returned by the members
attribute.
[35]:
products = model.groups.get_by_id("Products")
products.members
[35]:
[<MassMetabolite h_c at 0x7fe8c9834550>,
<MassMetabolite fdp_c at 0x7fe8c97f8d90>,
<MassMetabolite adp_c at 0x7fe8c9834400>]
Groups are also used to categorize reactions. For example, the group atp_c_binding
contains all of the reactions that represent the binding of ATP to the free active sites of the PFK enzyme.
[36]:
complexed_w_atp = model.groups.get_by_id("atp_c_binding")
for member in complexed_w_atp.members:
print(member)
PFK_R41: atp_c + pfk_R4_c <=> pfk_R4_A_c
PFK_R01: atp_c + pfk_R0_c <=> pfk_R0_A_c
PFK_R11: atp_c + pfk_R1_c <=> pfk_R1_A_c
PFK_R31: atp_c + pfk_R3_c <=> pfk_R3_A_c
PFK_R21: atp_c + pfk_R2_c <=> pfk_R2_A_c
Because groups are sets, members of groups are returned in no particular order. To maintain a consistent order, the sorted()
function is used with the attrgetter()
function from the operator
module to sort members by a particular object attribute.
[37]:
for member in sorted(complexed_w_atp.members, key=attrgetter("id")):
print(member)
PFK_R01: atp_c + pfk_R0_c <=> pfk_R0_A_c
PFK_R11: atp_c + pfk_R1_c <=> pfk_R1_A_c
PFK_R21: atp_c + pfk_R2_c <=> pfk_R2_A_c
PFK_R31: atp_c + pfk_R3_c <=> pfk_R3_A_c
PFK_R41: atp_c + pfk_R4_c <=> pfk_R4_A_c
1.4.4. Units
Unit
and UnitDefinition
objects are implemented, as per the SBML Unit and SBML UnitDefinition specifications. The primary purpose of these objects is to inform users of the model’s units, providing context to model values and observed results.
It is important to note that unit consistency is NOT checked by the MassModel
, meaning that it is incumbent upon users to maintain consistency of units and associated numerical values in a model.
UnitDefinition
objects are stored in a cobra.DictList
, accessible through the units
attribute.
[38]:
model.units
[38]:
[<UnitDefinition Millimolar "mM" at 0x7fe8c98a3eb0>,
<UnitDefinition hour "hr" at 0x7fe8c97f8c70>]
UnitDefinition
objects have identifiers and optional names. Therefore, a specific unit can be accessed using the get_by_id()
method.
[39]:
concentration_unit = model.units.get_by_id("mM")
print(concentration_unit.id)
print(concentration_unit.name)
mM
Millimolar
A UnitDefinition
is comprised of base units, which are stored in the list_of_units
attribute. Each unit must have a defined kind
, exponent
, scale
, and multiplier
.
[40]:
for unit in concentration_unit.list_of_units:
print(unit)
kind: litre; exponent: -1; scale: 0; multiplier: 1
kind: mole; exponent: 1; scale: -3; multiplier: 1
MASSpy contains some commonly defined Unit
objects to aid in the creation of UnitDefinition
objects, which are viewed using the print_defined_unit_values()
function.
[41]:
mass.core.units.print_defined_unit_values("Units")
╒════════════════════════════════════════════════════════════════════╕
│ Pre-defined Units │
╞════════════════════════════════════════════════════════════════════╡
│ Unit Definition │
│ ---------- ------------------------------------------------------ │
│ mole kind: mole; exponent: 1; scale: 0; multiplier: 1 │
│ millimole kind: mole; exponent: 1; scale: -3; multiplier: 1 │
│ litre kind: litre; exponent: 1; scale: 0; multiplier: 1 │
│ per_litre kind: litre; exponent: -1; scale: 0; multiplier: 1 │
│ second kind: second; exponent: 1; scale: 0; multiplier: 1 │
│ per_second kind: second; exponent: -1; scale: 0; multiplier: 1 │
│ hour kind: second; exponent: 1; scale: 0; multiplier: 3600 │
│ per_hour kind: second; exponent: -1; scale: 0; multiplier: 3600 │
│ per_gDW kind: gram; exponent: -1; scale: 0; multiplier: 1 │
╘════════════════════════════════════════════════════════════════════╛
\(^1\) The “textbook” model is created from Chapters 10-14 of [Pal11]