{ "cells": [ { "cell_type": "markdown", "id": "fb270ec2-29da-4100-9485-39da5cce1663", "metadata": {}, "source": [ "# Memory- and time-efficient solving of ME-models\n", "\n", "In this tutorial we will convert the ME-model object to an NLP mathematical representation to save memory and time in simulating many conditions." ] }, { "cell_type": "markdown", "id": "d37fa4bb-3c8f-4a62-923c-6c1747ebb652", "metadata": {}, "source": [ "## Import libraries" ] }, { "cell_type": "code", "execution_count": 1, "id": "d048f35f-6a3b-4642-86cd-49e4c4e4a187", "metadata": { "ExecuteTime": { "end_time": "2022-12-12T06:27:35.182100Z", "start_time": "2022-12-12T06:27:35.157355Z" } }, "outputs": [], "source": [ "import coralme" ] }, { "cell_type": "markdown", "id": "051e8ee6-15ba-4017-8fc0-5f54da314cd8", "metadata": {}, "source": [ "## Load" ] }, { "cell_type": "markdown", "id": "77a83a0a-a0b6-4b74-b0b8-9331a34162d7", "metadata": {}, "source": [ "Load the ME-model coming out of the Troubleshooter" ] }, { "cell_type": "code", "execution_count": 2, "id": "b0d86d92-b978-402c-b16a-439986ea2904", "metadata": { "ExecuteTime": { "end_time": "2022-12-12T06:30:28.060280Z", "start_time": "2022-12-12T06:30:01.706785Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Adding Metabolites into the ME-model... : 100.0%|██████████| 4630/ 4630 [00:00<00:00]\n", "Adding ProcessData into the ME-model... : 100.0%|██████████| 4752/ 4752 [00:00<00:00]\n", "Adding Reactions into the ME-model... : 100.0%|██████████| 7758/ 7758 [00:18<00:00]\n", "Updating ME-model Reactions... : 100.0%|██████████| 6369/ 6369 [00:24<00:00]\n" ] } ], "source": [ "me = coralme.io.json.load_json_me_model(\"./bsubtilis/MEModel-step3-bsubtilis-TS.json\")" ] }, { "cell_type": "markdown", "id": "a987ed37-a5c6-4d81-8899-ccc7c93d0f12", "metadata": {}, "source": [ "## Convert to NLP problem" ] }, { "cell_type": "markdown", "id": "3359d378-7f16-4ea7-878b-c66824953b2d", "metadata": {}, "source": [ "The ME-model object *me* is a big object containing all data and metadata. This is not necessary when performing large-scale simulations, such as gene knockouts, or growth simulations under hundreds of conditions." ] }, { "cell_type": "markdown", "id": "f70e7e58-793a-4f33-979c-8f1ffda5e0b3", "metadata": {}, "source": [ "So, in these cases we only need the mathematical problem representing the ME-model, which is *nlp*." ] }, { "cell_type": "code", "execution_count": 3, "id": "2b88842f-40b9-43df-adc1-5b2c28584880", "metadata": {}, "outputs": [], "source": [ "from coralme.solver.solver import ME_NLP\n", "import cobra\n", "\n", "def get_nlp(model):\n", " # Call construct LP problem function from model to get precursor objects.\n", " # lamdify = True -> Creates lambdify functions to calculate bounds as a function of mu\n", " # per_position = True -> LB and UB bounds as list of lambdify instead of a lambdify to \n", " # be able to change individual values\n", " Sf, Se, lb, ub, b, c, cs, atoms, lambdas, Lr, Lm = model.construct_lp_problem(lambdify=True,per_position=True)\n", "\n", " # Construct NLP object from precursor objects\n", " me_nlp = ME_NLP(Sf, Se,b, c, lb, ub, cs, atoms, lambdas)\n", " return me_nlp" ] }, { "cell_type": "code", "execution_count": 4, "id": "bd633db0-44fb-4b1c-999c-d4d5fd68de07", "metadata": {}, "outputs": [], "source": [ "nlp = get_nlp(me)" ] }, { "cell_type": "markdown", "id": "52b951d1-c296-48e6-a89f-437a0c4e5a6c", "metadata": {}, "source": [ "## Retrieve metabolite and reaction indexes" ] }, { "cell_type": "markdown", "id": "31ab4574-f389-4ee8-b7d4-6041546207b6", "metadata": {}, "source": [ "The *nlp* now contains the mathematical representation, very similar to a struct object of the COBRA Toolbox in MATLAB. Similarly, reactions and metabolites are now accessed from integer indexes. We can create a dictionary from the original model to map reaction ids to indexes" ] }, { "cell_type": "code", "execution_count": 5, "id": "eee7e9f8-06a7-40da-a74f-2bf8e0c00dc5", "metadata": {}, "outputs": [], "source": [ "rxn_index_dct = {r.id : me.reactions.index(r) for r in me.reactions}\n", "met_index_dct = {m.id : me.metabolites.index(m) for m in me.metabolites}" ] }, { "cell_type": "markdown", "id": "fe7e35a5-1cc9-4933-899d-cd1ee0617ef8", "metadata": {}, "source": [ "From now on, *me* is no longer necessary and can be deleted to save memory usage. This is especially helpful when running parallelized simulations." ] }, { "cell_type": "code", "execution_count": 6, "id": "a5685c80-4129-4964-b9f5-55fc3ab77b29", "metadata": {}, "outputs": [], "source": [ "# del me" ] }, { "cell_type": "markdown", "id": "035c88fc-8b5b-4e16-8511-0819ca30240b", "metadata": {}, "source": [ "## Solving the MEModel vs. NLP" ] }, { "cell_type": "markdown", "id": "a351b8e3-6be9-45e0-bccf-4d234e2dc430", "metadata": {}, "source": [ "Now we can call the modified *optimize* function in *helpers*. This function was modified from the me.optimize() function of a coralme.core.model.MEModel." ] }, { "cell_type": "markdown", "id": "2454a482-30b7-4669-ba80-bff1f4ad4ad6", "metadata": {}, "source": [ "Here you can see the speed-up when solving from scratch and solving from the NLP. The speed-up is even more noticeable with bigger models, as lamdifying a longer list of constraints will take much longer." ] }, { "cell_type": "markdown", "id": "9e78f919-c105-423e-91fe-2a59e4224c50", "metadata": {}, "source": [ "### ME-model" ] }, { "cell_type": "code", "execution_count": 7, "id": "42db1888-6e48-407b-99f4-d3b8f826fa1b", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The MINOS and quad MINOS solvers are a courtesy of Prof Michael A. Saunders. Please cite Ma, D., Yang, L., Fleming, R. et al. Reliable and efficient solution of genome-scale models of Metabolism and macromolecular Expression. Sci Rep 7, 40863 (2017). https://doi.org/10.1038/srep40863\n", "\n", "Iteration\t Solution to check\tSolver Status\n", "---------\t------------------\t-------------\n", " 1\t0.1000000000000000\tOptimal\n", " 2\t0.1500000000000000\tNot feasible\n", " 3\t0.1250000000000000\tNot feasible\n", " 4\t0.1125000000000000\tNot feasible\n", " 5\t0.1062500000000000\tNot feasible\n", " 6\t0.1031250000000000\tNot feasible\n", " 7\t0.1015625000000000\tNot feasible\n", " 8\t0.1007812500000000\tOptimal\n", " 9\t0.1011718750000000\tOptimal\n", " 10\t0.1013671875000000\tNot feasible\n", " 11\t0.1012695312500000\tOptimal\n", " 12\t0.1013183593750000\tNot feasible\n", " 13\t0.1012939453125000\tNot feasible\n", " 14\t0.1012817382812500\tNot feasible\n", " 15\t0.1012756347656250\tOptimal\n", " 16\t0.1012786865234375\tOptimal\n", " 17\t0.1012802124023438\tNot feasible\n", " 18\t0.1012794494628906\tOptimal\n", "CPU times: user 1min 35s, sys: 36.7 ms, total: 1min 35s\n", "Wall time: 1min 35s\n" ] }, { "data": { "text/plain": [ "True" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%time\n", "me.optimize(max_mu = 0.2, min_mu = 0., maxIter = 100, lambdify = True,\n", "\t\ttolerance = 1e-6, precision = 'quad', verbose = True)" ] }, { "cell_type": "markdown", "id": "068f1486-90a9-45e2-affb-500b24029cb6", "metadata": {}, "source": [ "### NLP" ] }, { "cell_type": "code", "execution_count": 8, "id": "9c826512-de19-4d6e-b042-2b9109c81e54", "metadata": {}, "outputs": [], "source": [ "def optimize(rxn_index_dct,met_index_dct,me_nlp,max_mu = 2.8100561374051836, min_mu = 0., maxIter = 100,\n", "\t\ttolerance = 1e-6, precision = 'quad', verbose = True,basis=None):\n", " muopt, xopt, yopt, zopt, basis, stat = me_nlp.bisectmu(\n", "\t\t\t\tmumax = max_mu,\n", "\t\t\t\tmumin = min_mu,\n", "\t\t\t\tmaxIter = maxIter,\n", "\t\t\t\ttolerance = tolerance,\n", "\t\t\t\tprecision = precision,\n", "\t\t\t\tverbose = verbose,\n", " basis=basis)\n", "\n", " if stat == 'optimal':\n", " #f = sum([ rxn.objective_coefficient * xopt[idx] for idx, rxn in enumerate(self.reactions) ])\n", " x_primal = xopt[ 0:len(rxn_index_dct) ] # The remainder are the slacks\n", " x_dict = { rxn : xopt[idx] for rxn,idx in rxn_index_dct.items() }\n", " #y = pi\n", " # J = [S; c]\n", " y_dict = { met : yopt[idx] for met,idx in met_index_dct.items() }\n", " z_dict = { rxn : zopt[idx] for rxn,idx in rxn_index_dct.items() }\n", " #y_dict['linear_objective'] = y[len(y)-1]\n", "\n", " #self.me.solution = Solution(f, x_primal, x_dict, y, y_dict, 'qminos', time_elapsed, status)\n", " return cobra.core.Solution(\n", " objective_value = muopt,\n", " status = stat,\n", " fluxes = x_dict, # x_primal is a numpy.array with only fluxes info\n", " reduced_costs = z_dict,\n", " shadow_prices = y_dict,\n", " ),basis\n", " else:\n", " return None,None" ] }, { "cell_type": "code", "execution_count": 9, "id": "c6d36926-e683-4f28-8bc4-2781b8286f05", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Iteration\t Solution to check\tSolver Status\n", "---------\t------------------\t-------------\n", " 1\t0.1000000000000000\tOptimal\n", " 2\t0.1500000000000000\tNot feasible\n", " 3\t0.1250000000000000\tNot feasible\n", " 4\t0.1125000000000000\tNot feasible\n", " 5\t0.1062500000000000\tNot feasible\n", " 6\t0.1031250000000000\tNot feasible\n", " 7\t0.1015625000000000\tNot feasible\n", " 8\t0.1007812500000000\tOptimal\n", " 9\t0.1011718750000000\tOptimal\n", " 10\t0.1013671875000000\tNot feasible\n", " 11\t0.1012695312500000\tOptimal\n", " 12\t0.1013183593750000\tNot feasible\n", " 13\t0.1012939453125000\tNot feasible\n", " 14\t0.1012817382812500\tNot feasible\n", " 15\t0.1012756347656250\tOptimal\n", " 16\t0.1012786865234375\tOptimal\n", " 17\t0.1012802124023438\tNot feasible\n", " 18\t0.1012794494628906\tOptimal\n", "CPU times: user 1min 26s, sys: 36.2 ms, total: 1min 26s\n", "Wall time: 1min 26s\n" ] } ], "source": [ "%%time\n", "sol,basis = optimize(rxn_index_dct,met_index_dct,nlp,max_mu = 0.2, min_mu = 0., maxIter = 100,\n", "\t\ttolerance = 1e-6, precision = 'quad', verbose = True, basis = None)" ] }, { "cell_type": "markdown", "id": "013dfedb-26fd-441e-a2ce-ba164a1677bb", "metadata": {}, "source": [ "## Re-using the basis for even more speed-up" ] }, { "cell_type": "markdown", "id": "a2c87cb1-bc77-4702-9c51-41b85686d7b9", "metadata": {}, "source": [ "We can re-use a basis from a previously successful simulation to warm-start the first iteration and save even more time! " ] }, { "cell_type": "markdown", "id": "6dc192e2-b722-44b2-8d26-b7c62345a3d2", "metadata": {}, "source": [ "### Re-using basis" ] }, { "cell_type": "code", "execution_count": 10, "id": "8fd95e1d-68d4-4754-9fae-733d77c61a69", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Iteration\t Solution to check\tSolver Status\n", "---------\t------------------\t-------------\n", " 1\t0.1000000000000000\tOptimal\n", "CPU times: user 5.24 s, sys: 1 µs, total: 5.24 s\n", "Wall time: 5.23 s\n" ] } ], "source": [ "%%time\n", "sol,_ = optimize(rxn_index_dct,met_index_dct,nlp,max_mu = 0.2, min_mu = 0., maxIter = 1,\n", "\t\ttolerance = 1e-6, precision = 'quad', verbose = True, basis = basis)" ] }, { "cell_type": "markdown", "id": "d9e7b3c6-ab34-4445-9530-e07273131745", "metadata": {}, "source": [ "### Cold start" ] }, { "cell_type": "code", "execution_count": 11, "id": "30402490-99b2-42a8-a9cd-fde0b60a36e1", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Iteration\t Solution to check\tSolver Status\n", "---------\t------------------\t-------------\n", " 1\t0.1000000000000000\tOptimal\n", "CPU times: user 45.8 s, sys: 600 µs, total: 45.8 s\n", "Wall time: 45.7 s\n" ] } ], "source": [ "%%time\n", "sol,_ = optimize(rxn_index_dct,met_index_dct,nlp,max_mu = 0.2, min_mu = 0., maxIter = 1, \n", "\t\ttolerance = 1e-6, precision = 'quad', verbose = True, basis = None)" ] }, { "cell_type": "markdown", "id": "d6830ad0-e284-4b17-871a-f383e664e21e", "metadata": {}, "source": [ "### Full calculation" ] }, { "cell_type": "code", "execution_count": 12, "id": "4621b5b6-77eb-4794-acae-c3260d444d55", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Iteration\t Solution to check\tSolver Status\n", "---------\t------------------\t-------------\n", " 1\t0.1000000000000000\tOptimal\n", " 2\t0.1500000000000000\tNot feasible\n", " 3\t0.1250000000000000\tNot feasible\n", " 4\t0.1125000000000000\tNot feasible\n", " 5\t0.1062500000000000\tNot feasible\n", " 6\t0.1031250000000000\tNot feasible\n", " 7\t0.1015625000000000\tNot feasible\n", " 8\t0.1007812500000000\tOptimal\n", " 9\t0.1011718750000000\tOptimal\n", " 10\t0.1013671875000000\tNot feasible\n", " 11\t0.1012695312500000\tOptimal\n", " 12\t0.1013183593750000\tNot feasible\n", " 13\t0.1012939453125000\tNot feasible\n", " 14\t0.1012817382812500\tNot feasible\n", " 15\t0.1012756347656250\tOptimal\n", " 16\t0.1012786865234375\tOptimal\n", " 17\t0.1012802124023438\tNot feasible\n", " 18\t0.1012794494628906\tOptimal\n", "CPU times: user 44.1 s, sys: 3.99 ms, total: 44.1 s\n", "Wall time: 44.1 s\n" ] } ], "source": [ "%%time\n", "sol,basis = optimize(rxn_index_dct,met_index_dct,nlp,max_mu = 0.2, min_mu = 0., maxIter = 100, \n", "\t\ttolerance = 1e-6, precision = 'quad', verbose = True, basis = basis)" ] }, { "cell_type": "markdown", "id": "05fa5361-666a-42f2-82d2-bb02ae65c617", "metadata": {}, "source": [ "## Modifying the NLP" ] }, { "cell_type": "markdown", "id": "2850d329-6531-45b4-9c64-1ac6cbfeaa47", "metadata": {}, "source": [ "As previously mentioned, the NLP resembles a struct object of the COBRA Toolbox. The model is stored as a collection of vectors and matrices representing stoichiometries, bounds and other variables needed by the solvers." ] }, { "cell_type": "markdown", "id": "fefd3b0a-e78c-4ac3-821c-1dbcf589b6d7", "metadata": {}, "source": [ "The relevant properties are:\n", "* **xu**: Upper bounds\n", "* **xl**: Lower bounds\n", "* **S**: Stoichiometric matrix (Metabolites x Reactions)" ] }, { "cell_type": "markdown", "id": "57ac8eb0-4d56-4905-a485-cdac3da1c4b2", "metadata": {}, "source": [ "The carbon source right now is Glucose, so we will change its bound to -10 to try to achieve maximum growth rate. \n", "\n", "**Note that bounds contain *lambdify* objects, not floats!**" ] }, { "cell_type": "code", "execution_count": 13, "id": "d1fb881c-b573-4ea9-b6a1-01f54edd89d0", "metadata": {}, "outputs": [], "source": [ "nlp.xl[rxn_index_dct[\"EX_glc__D_e\"]] = lambda x:-10" ] }, { "cell_type": "code", "execution_count": 14, "id": "b149a559-bb62-4383-b0b9-60d199cc75c6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Iteration\t Solution to check\tSolver Status\n", "---------\t------------------\t-------------\n", " 1\t0.2500000000000000\tNot feasible\n", " 2\t0.1250000000000000\tOptimal\n", " 3\t0.1875000000000000\tOptimal\n", " 4\t0.2187500000000000\tNot feasible\n", " 5\t0.2031250000000000\tOptimal\n", " 6\t0.2109375000000000\tNot feasible\n", " 7\t0.2070312500000000\tOptimal\n", " 8\t0.2089843750000000\tNot feasible\n", " 9\t0.2080078125000000\tOptimal\n", " 10\t0.2084960937500000\tOptimal\n", " 11\t0.2087402343750000\tOptimal\n", " 12\t0.2088623046875000\tNot feasible\n", " 13\t0.2088012695312500\tNot feasible\n", " 14\t0.2087707519531250\tOptimal\n", " 15\t0.2087860107421875\tNot feasible\n", " 16\t0.2087783813476562\tOptimal\n", " 17\t0.2087821960449219\tOptimal\n", " 18\t0.2087841033935547\tNot feasible\n", " 19\t0.2087831497192383\tOptimal\n", "CPU times: user 29.8 s, sys: 11.7 ms, total: 29.8 s\n", "Wall time: 29.8 s\n" ] } ], "source": [ "%%time\n", "sol,basis = optimize(rxn_index_dct,met_index_dct,nlp,max_mu = 0.5, min_mu = 0., maxIter = 100, \n", "\t\ttolerance = 1e-6, precision = 'quad', verbose = True, basis = basis)" ] }, { "cell_type": "markdown", "id": "d1bdfce7-2056-42e5-8011-9149cbc5b433", "metadata": {}, "source": [ "## Modifying the NLP from a dictionary of new bounds" ] }, { "cell_type": "markdown", "id": "a69659e5-cc79-487d-9862-c4d51c4df66f", "metadata": {}, "source": [ "Make sure to follow this method so that the lambda does not store pointers to a variable but rather a fixed constant (if that is what you want)." ] }, { "cell_type": "code", "execution_count": 15, "id": "07189636-28f4-4651-976d-e8abf9a6c20b", "metadata": {}, "outputs": [], "source": [ "def set_exchanges(nlp,dct):\n", " for k,v in dct.items():\n", " nlp.xl[rxn_index_dct[k]] = lambda _,x=v:x" ] }, { "cell_type": "code", "execution_count": 16, "id": "db1faac8-9fdd-430b-b724-c8c9f5dcfa11", "metadata": {}, "outputs": [], "source": [ "exchanges = {\n", " \"EX_glc__D_e\" : -10,\n", " \"EX_o2_e\" : -0.6,\n", " \"EX_fru_e\" : -5,\n", "}" ] }, { "cell_type": "code", "execution_count": 17, "id": "cc1c75e9-66c7-469e-871e-af4d71ea0c38", "metadata": {}, "outputs": [], "source": [ "set_exchanges(nlp,exchanges)" ] }, { "cell_type": "markdown", "id": "5ca78778-357e-4090-aa13-37e2582c34b4", "metadata": {}, "source": [ "## Inspecting the solution" ] }, { "cell_type": "markdown", "id": "cc497768-f7fe-416d-80b6-65080330dc35", "metadata": {}, "source": [ "The function returns a cobra.Solution object just like the one stored in me.solution. For more details inspecting *sol*, refer to [Inspecting predicted fluxes](InspectingFluxes.ipynb)." ] }, { "cell_type": "code", "execution_count": 18, "id": "02fe3bde-69cb-4318-964e-715a3f281111", "metadata": {}, "outputs": [ { "data": { "text/html": [ "Optimal solution with objective value 0.209
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fluxesreduced_costs
biomass_dilution2.087831e-011.073665e-02
BSU15140-MONOMER_to_generic_16Sm4Cm14027.678233e-12-1.125814e-34
RNA_BSU_rRNA_1_to_generic_16s_rRNAs3.869460e-061.226008e-33
RNA_BSU_rRNA_16_to_generic_16s_rRNAs0.000000e+00-1.174650e-02
RNA_BSU_rRNA_19_to_generic_16s_rRNAs1.445095e-060.000000e+00
.........
translocation_BSU25290_Plasma_Membrane1.535647e-11-1.532963e-34
translocation_BSU27650_Plasma_Membrane4.123735e-063.665935e-34
translocation_BSU33630_Plasma_Membrane4.575496e-06-5.894133e-35
translocation_BSU35300_Plasma_Membrane4.196604e-060.000000e+00
translocation_BSU41040_Plasma_Membrane4.433779e-070.000000e+00
\n", "

7758 rows × 2 columns

\n", "
" ], "text/plain": [ "" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sol" ] } ], "metadata": { "kernelspec": { "display_name": "coralme-dev", "language": "python", "name": "coralme-dev" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" }, "toc-autonumbering": true, "toc-showcode": true }, "nbformat": 4, "nbformat_minor": 5 }