9. Memory- and time-efficient solving of ME-models

In this tutorial we will convert the ME-model object to an NLP mathematical representation to save memory and time in simulating many conditions.

9.1. Import libraries

[1]:
import coralme

9.2. Load

Load the ME-model coming out of the Troubleshooter

[2]:
me = coralme.io.json.load_json_me_model("./bsubtilis/MEModel-step3-bsubtilis-TS.json")
Adding Metabolites into the ME-model...                                    : 100.0%|██████████|  4630/ 4630 [00:00<00:00]
Adding ProcessData into the ME-model...                                    : 100.0%|██████████|  4752/ 4752 [00:00<00:00]
Adding Reactions into the ME-model...                                      : 100.0%|██████████|  7758/ 7758 [00:18<00:00]
Updating ME-model Reactions...                                             : 100.0%|██████████|  6369/ 6369 [00:24<00:00]

9.3. Convert to NLP problem

The ME-model object me is a big object containing all data and metadata. This is not necessary when performing large-scale simulations, such as gene knockouts, or growth simulations under hundreds of conditions.

So, in these cases we only need the mathematical problem representing the ME-model, which is nlp.

[3]:
from coralme.solver.solver import ME_NLP
import cobra

def get_nlp(model):
    # Call construct LP problem function from model to get precursor objects.
    # lamdify = True -> Creates lambdify functions to calculate bounds as a function of mu
    # per_position = True -> LB and UB bounds as list of lambdify instead of a lambdify to
    # be able to change individual values
    Sf, Se, lb, ub, b, c, cs, atoms, lambdas, Lr, Lm = model.construct_lp_problem(lambdify=True,per_position=True)

    # Construct NLP object from precursor objects
    me_nlp = ME_NLP(Sf, Se,b, c, lb, ub,  cs, atoms, lambdas)
    return me_nlp
[4]:
nlp = get_nlp(me)

9.4. Retrieve metabolite and reaction indexes

The nlp now contains the mathematical representation, very similar to a struct object of the COBRA Toolbox in MATLAB. Similarly, reactions and metabolites are now accessed from integer indexes. We can create a dictionary from the original model to map reaction ids to indexes

[5]:
rxn_index_dct = {r.id : me.reactions.index(r) for r in me.reactions}
met_index_dct = {m.id : me.metabolites.index(m) for m in me.metabolites}

From now on, me is no longer necessary and can be deleted to save memory usage. This is especially helpful when running parallelized simulations.

[6]:
# del me

9.5. Solving the MEModel vs. NLP

Now we can call the modified optimize function in helpers. This function was modified from the me.optimize() function of a coralme.core.model.MEModel.

Here you can see the speed-up when solving from scratch and solving from the NLP. The speed-up is even more noticeable with bigger models, as lamdifying a longer list of constraints will take much longer.

9.5.1. ME-model

[7]:
%%time
me.optimize(max_mu = 0.2, min_mu = 0., maxIter = 100, lambdify = True,
            tolerance = 1e-6, precision = 'quad', verbose = True)
The MINOS and quad MINOS solvers are a courtesy of Prof Michael A. Saunders. Please cite Ma, D., Yang, L., Fleming, R. et al. Reliable and efficient solution of genome-scale models of Metabolism and macromolecular Expression. Sci Rep 7, 40863 (2017). https://doi.org/10.1038/srep40863

Iteration        Solution to check      Solver Status
---------       ------------------      -------------
        1       0.1000000000000000      Optimal
        2       0.1500000000000000      Not feasible
        3       0.1250000000000000      Not feasible
        4       0.1125000000000000      Not feasible
        5       0.1062500000000000      Not feasible
        6       0.1031250000000000      Not feasible
        7       0.1015625000000000      Not feasible
        8       0.1007812500000000      Optimal
        9       0.1011718750000000      Optimal
       10       0.1013671875000000      Not feasible
       11       0.1012695312500000      Optimal
       12       0.1013183593750000      Not feasible
       13       0.1012939453125000      Not feasible
       14       0.1012817382812500      Not feasible
       15       0.1012756347656250      Optimal
       16       0.1012786865234375      Optimal
       17       0.1012802124023438      Not feasible
       18       0.1012794494628906      Optimal
CPU times: user 1min 35s, sys: 36.7 ms, total: 1min 35s
Wall time: 1min 35s
[7]:
True

9.5.2. NLP

[8]:
def optimize(rxn_index_dct,met_index_dct,me_nlp,max_mu = 2.8100561374051836, min_mu = 0., maxIter = 100,
            tolerance = 1e-6, precision = 'quad', verbose = True,basis=None):
    muopt, xopt, yopt, zopt, basis, stat = me_nlp.bisectmu(
                            mumax = max_mu,
                            mumin = min_mu,
                            maxIter = maxIter,
                            tolerance = tolerance,
                            precision = precision,
                            verbose = verbose,
                basis=basis)

    if stat == 'optimal':
        #f = sum([ rxn.objective_coefficient * xopt[idx] for idx, rxn in enumerate(self.reactions) ])
        x_primal = xopt[ 0:len(rxn_index_dct) ]   # The remainder are the slacks
        x_dict = { rxn : xopt[idx] for rxn,idx in rxn_index_dct.items() }
        #y = pi
        # J = [S; c]
        y_dict = { met : yopt[idx] for met,idx in met_index_dct.items() }
        z_dict = { rxn : zopt[idx] for rxn,idx in rxn_index_dct.items() }
        #y_dict['linear_objective'] = y[len(y)-1]

        #self.me.solution = Solution(f, x_primal, x_dict, y, y_dict, 'qminos', time_elapsed, status)
        return cobra.core.Solution(
            objective_value = muopt,
            status = stat,
            fluxes = x_dict, # x_primal is a numpy.array with only fluxes info
            reduced_costs = z_dict,
            shadow_prices = y_dict,
            ),basis
    else:
        return None,None
[9]:
%%time
sol,basis = optimize(rxn_index_dct,met_index_dct,nlp,max_mu = 0.2, min_mu = 0., maxIter = 100,
            tolerance = 1e-6, precision = 'quad', verbose = True, basis = None)
Iteration        Solution to check      Solver Status
---------       ------------------      -------------
        1       0.1000000000000000      Optimal
        2       0.1500000000000000      Not feasible
        3       0.1250000000000000      Not feasible
        4       0.1125000000000000      Not feasible
        5       0.1062500000000000      Not feasible
        6       0.1031250000000000      Not feasible
        7       0.1015625000000000      Not feasible
        8       0.1007812500000000      Optimal
        9       0.1011718750000000      Optimal
       10       0.1013671875000000      Not feasible
       11       0.1012695312500000      Optimal
       12       0.1013183593750000      Not feasible
       13       0.1012939453125000      Not feasible
       14       0.1012817382812500      Not feasible
       15       0.1012756347656250      Optimal
       16       0.1012786865234375      Optimal
       17       0.1012802124023438      Not feasible
       18       0.1012794494628906      Optimal
CPU times: user 1min 26s, sys: 36.2 ms, total: 1min 26s
Wall time: 1min 26s

9.6. Re-using the basis for even more speed-up

We can re-use a basis from a previously successful simulation to warm-start the first iteration and save even more time!

9.6.1. Re-using basis

[10]:
%%time
sol,_ = optimize(rxn_index_dct,met_index_dct,nlp,max_mu = 0.2, min_mu = 0., maxIter = 1,
            tolerance = 1e-6, precision = 'quad', verbose = True, basis = basis)
Iteration        Solution to check      Solver Status
---------       ------------------      -------------
        1       0.1000000000000000      Optimal
CPU times: user 5.24 s, sys: 1 µs, total: 5.24 s
Wall time: 5.23 s

9.6.2. Cold start

[11]:
%%time
sol,_ = optimize(rxn_index_dct,met_index_dct,nlp,max_mu = 0.2, min_mu = 0., maxIter = 1,
            tolerance = 1e-6, precision = 'quad', verbose = True, basis = None)
Iteration        Solution to check      Solver Status
---------       ------------------      -------------
        1       0.1000000000000000      Optimal
CPU times: user 45.8 s, sys: 600 µs, total: 45.8 s
Wall time: 45.7 s

9.6.3. Full calculation

[12]:
%%time
sol,basis = optimize(rxn_index_dct,met_index_dct,nlp,max_mu = 0.2, min_mu = 0., maxIter = 100,
            tolerance = 1e-6, precision = 'quad', verbose = True, basis = basis)
Iteration        Solution to check      Solver Status
---------       ------------------      -------------
        1       0.1000000000000000      Optimal
        2       0.1500000000000000      Not feasible
        3       0.1250000000000000      Not feasible
        4       0.1125000000000000      Not feasible
        5       0.1062500000000000      Not feasible
        6       0.1031250000000000      Not feasible
        7       0.1015625000000000      Not feasible
        8       0.1007812500000000      Optimal
        9       0.1011718750000000      Optimal
       10       0.1013671875000000      Not feasible
       11       0.1012695312500000      Optimal
       12       0.1013183593750000      Not feasible
       13       0.1012939453125000      Not feasible
       14       0.1012817382812500      Not feasible
       15       0.1012756347656250      Optimal
       16       0.1012786865234375      Optimal
       17       0.1012802124023438      Not feasible
       18       0.1012794494628906      Optimal
CPU times: user 44.1 s, sys: 3.99 ms, total: 44.1 s
Wall time: 44.1 s

9.7. Modifying the NLP

As previously mentioned, the NLP resembles a struct object of the COBRA Toolbox. The model is stored as a collection of vectors and matrices representing stoichiometries, bounds and other variables needed by the solvers.

The relevant properties are: * xu: Upper bounds * xl: Lower bounds * S: Stoichiometric matrix (Metabolites x Reactions)

The carbon source right now is Glucose, so we will change its bound to -10 to try to achieve maximum growth rate.

Note that bounds contain lambdify objects, not floats!

[13]:
nlp.xl[rxn_index_dct["EX_glc__D_e"]] = lambda x:-10
[14]:
%%time
sol,basis = optimize(rxn_index_dct,met_index_dct,nlp,max_mu = 0.5, min_mu = 0., maxIter = 100,
            tolerance = 1e-6, precision = 'quad', verbose = True, basis = basis)
Iteration        Solution to check      Solver Status
---------       ------------------      -------------
        1       0.2500000000000000      Not feasible
        2       0.1250000000000000      Optimal
        3       0.1875000000000000      Optimal
        4       0.2187500000000000      Not feasible
        5       0.2031250000000000      Optimal
        6       0.2109375000000000      Not feasible
        7       0.2070312500000000      Optimal
        8       0.2089843750000000      Not feasible
        9       0.2080078125000000      Optimal
       10       0.2084960937500000      Optimal
       11       0.2087402343750000      Optimal
       12       0.2088623046875000      Not feasible
       13       0.2088012695312500      Not feasible
       14       0.2087707519531250      Optimal
       15       0.2087860107421875      Not feasible
       16       0.2087783813476562      Optimal
       17       0.2087821960449219      Optimal
       18       0.2087841033935547      Not feasible
       19       0.2087831497192383      Optimal
CPU times: user 29.8 s, sys: 11.7 ms, total: 29.8 s
Wall time: 29.8 s

9.8. Modifying the NLP from a dictionary of new bounds

Make sure to follow this method so that the lambda does not store pointers to a variable but rather a fixed constant (if that is what you want).

[15]:
def set_exchanges(nlp,dct):
    for k,v in dct.items():
        nlp.xl[rxn_index_dct[k]] = lambda _,x=v:x
[16]:
exchanges = {
    "EX_glc__D_e" : -10,
    "EX_o2_e" : -0.6,
    "EX_fru_e" : -5,
}
[17]:
set_exchanges(nlp,exchanges)

9.9. Inspecting the solution

The function returns a cobra.Solution object just like the one stored in me.solution. For more details inspecting sol, refer to Inspecting predicted fluxes.

[18]:
sol
[18]:
Optimal solution with objective value 0.209
fluxes reduced_costs
biomass_dilution 2.087831e-01 1.073665e-02
BSU15140-MONOMER_to_generic_16Sm4Cm1402 7.678233e-12 -1.125814e-34
RNA_BSU_rRNA_1_to_generic_16s_rRNAs 3.869460e-06 1.226008e-33
RNA_BSU_rRNA_16_to_generic_16s_rRNAs 0.000000e+00 -1.174650e-02
RNA_BSU_rRNA_19_to_generic_16s_rRNAs 1.445095e-06 0.000000e+00
... ... ...
translocation_BSU25290_Plasma_Membrane 1.535647e-11 -1.532963e-34
translocation_BSU27650_Plasma_Membrane 4.123735e-06 3.665935e-34
translocation_BSU33630_Plasma_Membrane 4.575496e-06 -5.894133e-35
translocation_BSU35300_Plasma_Membrane 4.196604e-06 0.000000e+00
translocation_BSU41040_Plasma_Membrane 4.433779e-07 0.000000e+00

7758 rows × 2 columns