{
"cells": [
{
"cell_type": "markdown",
"id": "fb270ec2-29da-4100-9485-39da5cce1663",
"metadata": {},
"source": [
"# Memory- and time-efficient solving of ME-models\n",
"\n",
"In this tutorial we will convert the ME-model object to an NLP mathematical representation to save memory and time in simulating many conditions."
]
},
{
"cell_type": "markdown",
"id": "d37fa4bb-3c8f-4a62-923c-6c1747ebb652",
"metadata": {},
"source": [
"## Import libraries"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "d048f35f-6a3b-4642-86cd-49e4c4e4a187",
"metadata": {
"ExecuteTime": {
"end_time": "2022-12-12T06:27:35.182100Z",
"start_time": "2022-12-12T06:27:35.157355Z"
}
},
"outputs": [],
"source": [
"import coralme"
]
},
{
"cell_type": "markdown",
"id": "051e8ee6-15ba-4017-8fc0-5f54da314cd8",
"metadata": {},
"source": [
"## Load"
]
},
{
"cell_type": "markdown",
"id": "77a83a0a-a0b6-4b74-b0b8-9331a34162d7",
"metadata": {},
"source": [
"Load the ME-model coming out of the Troubleshooter"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "b0d86d92-b978-402c-b16a-439986ea2904",
"metadata": {
"ExecuteTime": {
"end_time": "2022-12-12T06:30:28.060280Z",
"start_time": "2022-12-12T06:30:01.706785Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Adding Metabolites into the ME-model... : 100.0%|██████████| 4630/ 4630 [00:00<00:00]\n",
"Adding ProcessData into the ME-model... : 100.0%|██████████| 4752/ 4752 [00:00<00:00]\n",
"Adding Reactions into the ME-model... : 100.0%|██████████| 7758/ 7758 [00:18<00:00]\n",
"Updating ME-model Reactions... : 100.0%|██████████| 6369/ 6369 [00:24<00:00]\n"
]
}
],
"source": [
"me = coralme.io.json.load_json_me_model(\"./bsubtilis/MEModel-step3-bsubtilis-TS.json\")"
]
},
{
"cell_type": "markdown",
"id": "a987ed37-a5c6-4d81-8899-ccc7c93d0f12",
"metadata": {},
"source": [
"## Convert to NLP problem"
]
},
{
"cell_type": "markdown",
"id": "3359d378-7f16-4ea7-878b-c66824953b2d",
"metadata": {},
"source": [
"The ME-model object *me* is a big object containing all data and metadata. This is not necessary when performing large-scale simulations, such as gene knockouts, or growth simulations under hundreds of conditions."
]
},
{
"cell_type": "markdown",
"id": "f70e7e58-793a-4f33-979c-8f1ffda5e0b3",
"metadata": {},
"source": [
"So, in these cases we only need the mathematical problem representing the ME-model, which is *nlp*."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "2b88842f-40b9-43df-adc1-5b2c28584880",
"metadata": {},
"outputs": [],
"source": [
"from coralme.solver.solver import ME_NLP\n",
"import cobra\n",
"\n",
"def get_nlp(model):\n",
" # Call construct LP problem function from model to get precursor objects.\n",
" # lamdify = True -> Creates lambdify functions to calculate bounds as a function of mu\n",
" # per_position = True -> LB and UB bounds as list of lambdify instead of a lambdify to \n",
" # be able to change individual values\n",
" Sf, Se, lb, ub, b, c, cs, atoms, lambdas, Lr, Lm = model.construct_lp_problem(lambdify=True,per_position=True)\n",
"\n",
" # Construct NLP object from precursor objects\n",
" me_nlp = ME_NLP(Sf, Se,b, c, lb, ub, cs, atoms, lambdas)\n",
" return me_nlp"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "bd633db0-44fb-4b1c-999c-d4d5fd68de07",
"metadata": {},
"outputs": [],
"source": [
"nlp = get_nlp(me)"
]
},
{
"cell_type": "markdown",
"id": "52b951d1-c296-48e6-a89f-437a0c4e5a6c",
"metadata": {},
"source": [
"## Retrieve metabolite and reaction indexes"
]
},
{
"cell_type": "markdown",
"id": "31ab4574-f389-4ee8-b7d4-6041546207b6",
"metadata": {},
"source": [
"The *nlp* now contains the mathematical representation, very similar to a struct object of the COBRA Toolbox in MATLAB. Similarly, reactions and metabolites are now accessed from integer indexes. We can create a dictionary from the original model to map reaction ids to indexes"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "eee7e9f8-06a7-40da-a74f-2bf8e0c00dc5",
"metadata": {},
"outputs": [],
"source": [
"rxn_index_dct = {r.id : me.reactions.index(r) for r in me.reactions}\n",
"met_index_dct = {m.id : me.metabolites.index(m) for m in me.metabolites}"
]
},
{
"cell_type": "markdown",
"id": "fe7e35a5-1cc9-4933-899d-cd1ee0617ef8",
"metadata": {},
"source": [
"From now on, *me* is no longer necessary and can be deleted to save memory usage. This is especially helpful when running parallelized simulations."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "a5685c80-4129-4964-b9f5-55fc3ab77b29",
"metadata": {},
"outputs": [],
"source": [
"# del me"
]
},
{
"cell_type": "markdown",
"id": "035c88fc-8b5b-4e16-8511-0819ca30240b",
"metadata": {},
"source": [
"## Solving the MEModel vs. NLP"
]
},
{
"cell_type": "markdown",
"id": "a351b8e3-6be9-45e0-bccf-4d234e2dc430",
"metadata": {},
"source": [
"Now we can call the modified *optimize* function in *helpers*. This function was modified from the me.optimize() function of a coralme.core.model.MEModel."
]
},
{
"cell_type": "markdown",
"id": "2454a482-30b7-4669-ba80-bff1f4ad4ad6",
"metadata": {},
"source": [
"Here you can see the speed-up when solving from scratch and solving from the NLP. The speed-up is even more noticeable with bigger models, as lamdifying a longer list of constraints will take much longer."
]
},
{
"cell_type": "markdown",
"id": "9e78f919-c105-423e-91fe-2a59e4224c50",
"metadata": {},
"source": [
"### ME-model"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "42db1888-6e48-407b-99f4-d3b8f826fa1b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The MINOS and quad MINOS solvers are a courtesy of Prof Michael A. Saunders. Please cite Ma, D., Yang, L., Fleming, R. et al. Reliable and efficient solution of genome-scale models of Metabolism and macromolecular Expression. Sci Rep 7, 40863 (2017). https://doi.org/10.1038/srep40863\n",
"\n",
"Iteration\t Solution to check\tSolver Status\n",
"---------\t------------------\t-------------\n",
" 1\t0.1000000000000000\tOptimal\n",
" 2\t0.1500000000000000\tNot feasible\n",
" 3\t0.1250000000000000\tNot feasible\n",
" 4\t0.1125000000000000\tNot feasible\n",
" 5\t0.1062500000000000\tNot feasible\n",
" 6\t0.1031250000000000\tNot feasible\n",
" 7\t0.1015625000000000\tNot feasible\n",
" 8\t0.1007812500000000\tOptimal\n",
" 9\t0.1011718750000000\tOptimal\n",
" 10\t0.1013671875000000\tNot feasible\n",
" 11\t0.1012695312500000\tOptimal\n",
" 12\t0.1013183593750000\tNot feasible\n",
" 13\t0.1012939453125000\tNot feasible\n",
" 14\t0.1012817382812500\tNot feasible\n",
" 15\t0.1012756347656250\tOptimal\n",
" 16\t0.1012786865234375\tOptimal\n",
" 17\t0.1012802124023438\tNot feasible\n",
" 18\t0.1012794494628906\tOptimal\n",
"CPU times: user 1min 35s, sys: 36.7 ms, total: 1min 35s\n",
"Wall time: 1min 35s\n"
]
},
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"me.optimize(max_mu = 0.2, min_mu = 0., maxIter = 100, lambdify = True,\n",
"\t\ttolerance = 1e-6, precision = 'quad', verbose = True)"
]
},
{
"cell_type": "markdown",
"id": "068f1486-90a9-45e2-affb-500b24029cb6",
"metadata": {},
"source": [
"### NLP"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "9c826512-de19-4d6e-b042-2b9109c81e54",
"metadata": {},
"outputs": [],
"source": [
"def optimize(rxn_index_dct,met_index_dct,me_nlp,max_mu = 2.8100561374051836, min_mu = 0., maxIter = 100,\n",
"\t\ttolerance = 1e-6, precision = 'quad', verbose = True,basis=None):\n",
" muopt, xopt, yopt, zopt, basis, stat = me_nlp.bisectmu(\n",
"\t\t\t\tmumax = max_mu,\n",
"\t\t\t\tmumin = min_mu,\n",
"\t\t\t\tmaxIter = maxIter,\n",
"\t\t\t\ttolerance = tolerance,\n",
"\t\t\t\tprecision = precision,\n",
"\t\t\t\tverbose = verbose,\n",
" basis=basis)\n",
"\n",
" if stat == 'optimal':\n",
" #f = sum([ rxn.objective_coefficient * xopt[idx] for idx, rxn in enumerate(self.reactions) ])\n",
" x_primal = xopt[ 0:len(rxn_index_dct) ] # The remainder are the slacks\n",
" x_dict = { rxn : xopt[idx] for rxn,idx in rxn_index_dct.items() }\n",
" #y = pi\n",
" # J = [S; c]\n",
" y_dict = { met : yopt[idx] for met,idx in met_index_dct.items() }\n",
" z_dict = { rxn : zopt[idx] for rxn,idx in rxn_index_dct.items() }\n",
" #y_dict['linear_objective'] = y[len(y)-1]\n",
"\n",
" #self.me.solution = Solution(f, x_primal, x_dict, y, y_dict, 'qminos', time_elapsed, status)\n",
" return cobra.core.Solution(\n",
" objective_value = muopt,\n",
" status = stat,\n",
" fluxes = x_dict, # x_primal is a numpy.array with only fluxes info\n",
" reduced_costs = z_dict,\n",
" shadow_prices = y_dict,\n",
" ),basis\n",
" else:\n",
" return None,None"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "c6d36926-e683-4f28-8bc4-2781b8286f05",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Iteration\t Solution to check\tSolver Status\n",
"---------\t------------------\t-------------\n",
" 1\t0.1000000000000000\tOptimal\n",
" 2\t0.1500000000000000\tNot feasible\n",
" 3\t0.1250000000000000\tNot feasible\n",
" 4\t0.1125000000000000\tNot feasible\n",
" 5\t0.1062500000000000\tNot feasible\n",
" 6\t0.1031250000000000\tNot feasible\n",
" 7\t0.1015625000000000\tNot feasible\n",
" 8\t0.1007812500000000\tOptimal\n",
" 9\t0.1011718750000000\tOptimal\n",
" 10\t0.1013671875000000\tNot feasible\n",
" 11\t0.1012695312500000\tOptimal\n",
" 12\t0.1013183593750000\tNot feasible\n",
" 13\t0.1012939453125000\tNot feasible\n",
" 14\t0.1012817382812500\tNot feasible\n",
" 15\t0.1012756347656250\tOptimal\n",
" 16\t0.1012786865234375\tOptimal\n",
" 17\t0.1012802124023438\tNot feasible\n",
" 18\t0.1012794494628906\tOptimal\n",
"CPU times: user 1min 26s, sys: 36.2 ms, total: 1min 26s\n",
"Wall time: 1min 26s\n"
]
}
],
"source": [
"%%time\n",
"sol,basis = optimize(rxn_index_dct,met_index_dct,nlp,max_mu = 0.2, min_mu = 0., maxIter = 100,\n",
"\t\ttolerance = 1e-6, precision = 'quad', verbose = True, basis = None)"
]
},
{
"cell_type": "markdown",
"id": "013dfedb-26fd-441e-a2ce-ba164a1677bb",
"metadata": {},
"source": [
"## Re-using the basis for even more speed-up"
]
},
{
"cell_type": "markdown",
"id": "a2c87cb1-bc77-4702-9c51-41b85686d7b9",
"metadata": {},
"source": [
"We can re-use a basis from a previously successful simulation to warm-start the first iteration and save even more time! "
]
},
{
"cell_type": "markdown",
"id": "6dc192e2-b722-44b2-8d26-b7c62345a3d2",
"metadata": {},
"source": [
"### Re-using basis"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "8fd95e1d-68d4-4754-9fae-733d77c61a69",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Iteration\t Solution to check\tSolver Status\n",
"---------\t------------------\t-------------\n",
" 1\t0.1000000000000000\tOptimal\n",
"CPU times: user 5.24 s, sys: 1 µs, total: 5.24 s\n",
"Wall time: 5.23 s\n"
]
}
],
"source": [
"%%time\n",
"sol,_ = optimize(rxn_index_dct,met_index_dct,nlp,max_mu = 0.2, min_mu = 0., maxIter = 1,\n",
"\t\ttolerance = 1e-6, precision = 'quad', verbose = True, basis = basis)"
]
},
{
"cell_type": "markdown",
"id": "d9e7b3c6-ab34-4445-9530-e07273131745",
"metadata": {},
"source": [
"### Cold start"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "30402490-99b2-42a8-a9cd-fde0b60a36e1",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Iteration\t Solution to check\tSolver Status\n",
"---------\t------------------\t-------------\n",
" 1\t0.1000000000000000\tOptimal\n",
"CPU times: user 45.8 s, sys: 600 µs, total: 45.8 s\n",
"Wall time: 45.7 s\n"
]
}
],
"source": [
"%%time\n",
"sol,_ = optimize(rxn_index_dct,met_index_dct,nlp,max_mu = 0.2, min_mu = 0., maxIter = 1, \n",
"\t\ttolerance = 1e-6, precision = 'quad', verbose = True, basis = None)"
]
},
{
"cell_type": "markdown",
"id": "d6830ad0-e284-4b17-871a-f383e664e21e",
"metadata": {},
"source": [
"### Full calculation"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "4621b5b6-77eb-4794-acae-c3260d444d55",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Iteration\t Solution to check\tSolver Status\n",
"---------\t------------------\t-------------\n",
" 1\t0.1000000000000000\tOptimal\n",
" 2\t0.1500000000000000\tNot feasible\n",
" 3\t0.1250000000000000\tNot feasible\n",
" 4\t0.1125000000000000\tNot feasible\n",
" 5\t0.1062500000000000\tNot feasible\n",
" 6\t0.1031250000000000\tNot feasible\n",
" 7\t0.1015625000000000\tNot feasible\n",
" 8\t0.1007812500000000\tOptimal\n",
" 9\t0.1011718750000000\tOptimal\n",
" 10\t0.1013671875000000\tNot feasible\n",
" 11\t0.1012695312500000\tOptimal\n",
" 12\t0.1013183593750000\tNot feasible\n",
" 13\t0.1012939453125000\tNot feasible\n",
" 14\t0.1012817382812500\tNot feasible\n",
" 15\t0.1012756347656250\tOptimal\n",
" 16\t0.1012786865234375\tOptimal\n",
" 17\t0.1012802124023438\tNot feasible\n",
" 18\t0.1012794494628906\tOptimal\n",
"CPU times: user 44.1 s, sys: 3.99 ms, total: 44.1 s\n",
"Wall time: 44.1 s\n"
]
}
],
"source": [
"%%time\n",
"sol,basis = optimize(rxn_index_dct,met_index_dct,nlp,max_mu = 0.2, min_mu = 0., maxIter = 100, \n",
"\t\ttolerance = 1e-6, precision = 'quad', verbose = True, basis = basis)"
]
},
{
"cell_type": "markdown",
"id": "05fa5361-666a-42f2-82d2-bb02ae65c617",
"metadata": {},
"source": [
"## Modifying the NLP"
]
},
{
"cell_type": "markdown",
"id": "2850d329-6531-45b4-9c64-1ac6cbfeaa47",
"metadata": {},
"source": [
"As previously mentioned, the NLP resembles a struct object of the COBRA Toolbox. The model is stored as a collection of vectors and matrices representing stoichiometries, bounds and other variables needed by the solvers."
]
},
{
"cell_type": "markdown",
"id": "fefd3b0a-e78c-4ac3-821c-1dbcf589b6d7",
"metadata": {},
"source": [
"The relevant properties are:\n",
"* **xu**: Upper bounds\n",
"* **xl**: Lower bounds\n",
"* **S**: Stoichiometric matrix (Metabolites x Reactions)"
]
},
{
"cell_type": "markdown",
"id": "57ac8eb0-4d56-4905-a485-cdac3da1c4b2",
"metadata": {},
"source": [
"The carbon source right now is Glucose, so we will change its bound to -10 to try to achieve maximum growth rate. \n",
"\n",
"**Note that bounds contain *lambdify* objects, not floats!**"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "d1fb881c-b573-4ea9-b6a1-01f54edd89d0",
"metadata": {},
"outputs": [],
"source": [
"nlp.xl[rxn_index_dct[\"EX_glc__D_e\"]] = lambda x:-10"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "b149a559-bb62-4383-b0b9-60d199cc75c6",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Iteration\t Solution to check\tSolver Status\n",
"---------\t------------------\t-------------\n",
" 1\t0.2500000000000000\tNot feasible\n",
" 2\t0.1250000000000000\tOptimal\n",
" 3\t0.1875000000000000\tOptimal\n",
" 4\t0.2187500000000000\tNot feasible\n",
" 5\t0.2031250000000000\tOptimal\n",
" 6\t0.2109375000000000\tNot feasible\n",
" 7\t0.2070312500000000\tOptimal\n",
" 8\t0.2089843750000000\tNot feasible\n",
" 9\t0.2080078125000000\tOptimal\n",
" 10\t0.2084960937500000\tOptimal\n",
" 11\t0.2087402343750000\tOptimal\n",
" 12\t0.2088623046875000\tNot feasible\n",
" 13\t0.2088012695312500\tNot feasible\n",
" 14\t0.2087707519531250\tOptimal\n",
" 15\t0.2087860107421875\tNot feasible\n",
" 16\t0.2087783813476562\tOptimal\n",
" 17\t0.2087821960449219\tOptimal\n",
" 18\t0.2087841033935547\tNot feasible\n",
" 19\t0.2087831497192383\tOptimal\n",
"CPU times: user 29.8 s, sys: 11.7 ms, total: 29.8 s\n",
"Wall time: 29.8 s\n"
]
}
],
"source": [
"%%time\n",
"sol,basis = optimize(rxn_index_dct,met_index_dct,nlp,max_mu = 0.5, min_mu = 0., maxIter = 100, \n",
"\t\ttolerance = 1e-6, precision = 'quad', verbose = True, basis = basis)"
]
},
{
"cell_type": "markdown",
"id": "d1bdfce7-2056-42e5-8011-9149cbc5b433",
"metadata": {},
"source": [
"## Modifying the NLP from a dictionary of new bounds"
]
},
{
"cell_type": "markdown",
"id": "a69659e5-cc79-487d-9862-c4d51c4df66f",
"metadata": {},
"source": [
"Make sure to follow this method so that the lambda does not store pointers to a variable but rather a fixed constant (if that is what you want)."
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "07189636-28f4-4651-976d-e8abf9a6c20b",
"metadata": {},
"outputs": [],
"source": [
"def set_exchanges(nlp,dct):\n",
" for k,v in dct.items():\n",
" nlp.xl[rxn_index_dct[k]] = lambda _,x=v:x"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "db1faac8-9fdd-430b-b724-c8c9f5dcfa11",
"metadata": {},
"outputs": [],
"source": [
"exchanges = {\n",
" \"EX_glc__D_e\" : -10,\n",
" \"EX_o2_e\" : -0.6,\n",
" \"EX_fru_e\" : -5,\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "cc1c75e9-66c7-469e-871e-af4d71ea0c38",
"metadata": {},
"outputs": [],
"source": [
"set_exchanges(nlp,exchanges)"
]
},
{
"cell_type": "markdown",
"id": "5ca78778-357e-4090-aa13-37e2582c34b4",
"metadata": {},
"source": [
"## Inspecting the solution"
]
},
{
"cell_type": "markdown",
"id": "cc497768-f7fe-416d-80b6-65080330dc35",
"metadata": {},
"source": [
"The function returns a cobra.Solution object just like the one stored in me.solution. For more details inspecting *sol*, refer to [Inspecting predicted fluxes](InspectingFluxes.ipynb)."
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "02fe3bde-69cb-4318-964e-715a3f281111",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"Optimal solution with objective value 0.209
| \n", " | fluxes | \n", "reduced_costs | \n", "
|---|---|---|
| biomass_dilution | \n", "2.087831e-01 | \n", "1.073665e-02 | \n", "
| BSU15140-MONOMER_to_generic_16Sm4Cm1402 | \n", "7.678233e-12 | \n", "-1.125814e-34 | \n", "
| RNA_BSU_rRNA_1_to_generic_16s_rRNAs | \n", "3.869460e-06 | \n", "1.226008e-33 | \n", "
| RNA_BSU_rRNA_16_to_generic_16s_rRNAs | \n", "0.000000e+00 | \n", "-1.174650e-02 | \n", "
| RNA_BSU_rRNA_19_to_generic_16s_rRNAs | \n", "1.445095e-06 | \n", "0.000000e+00 | \n", "
| ... | \n", "... | \n", "... | \n", "
| translocation_BSU25290_Plasma_Membrane | \n", "1.535647e-11 | \n", "-1.532963e-34 | \n", "
| translocation_BSU27650_Plasma_Membrane | \n", "4.123735e-06 | \n", "3.665935e-34 | \n", "
| translocation_BSU33630_Plasma_Membrane | \n", "4.575496e-06 | \n", "-5.894133e-35 | \n", "
| translocation_BSU35300_Plasma_Membrane | \n", "4.196604e-06 | \n", "0.000000e+00 | \n", "
| translocation_BSU41040_Plasma_Membrane | \n", "4.433779e-07 | \n", "0.000000e+00 | \n", "
7758 rows × 2 columns
\n", "