{ "cells": [ { "cell_type": "markdown", "id": "fb270ec2-29da-4100-9485-39da5cce1663", "metadata": {}, "source": [ "# Reconstructing with BioCyc information and curated files\n", "\n", "In this tutorial, we reconstruct a _Bacillus subtilis_ ME-model starting from the input files alone." ] }, { "cell_type": "markdown", "id": "dbcc45ef-215d-4ee1-8383-bedd2efd084f", "metadata": {}, "source": [ "## Import libraries" ] }, { "cell_type": "code", "execution_count": 1, "id": "d048f35f-6a3b-4642-86cd-49e4c4e4a187", "metadata": { "ExecuteTime": { "end_time": "2022-12-12T06:27:35.182100Z", "start_time": "2022-12-12T06:27:35.157355Z" } }, "outputs": [], "source": [ "from coralme.builder.main import MEBuilder\n", "import coralme" ] }, { "cell_type": "markdown", "id": "f40390ac-ebd6-4c7c-88dc-fb7d5b276383", "metadata": {}, "source": [ "## Path to configuration files" ] }, { "cell_type": "markdown", "id": "fe8269c7-24de-4f65-b01f-3412883f238e", "metadata": {}, "source": [ "For more information about these files, see [Description of inputs](BasicInputs.ipynb). The configuration and layout used here follow those in [this *B. subtilis* ME-model repository](https://github.com/jdtibochab/coralme-models/tree/main/published/bsubtilis)." 
] }, { "cell_type": "code", "execution_count": 2, "id": "d62ee683-429e-481b-87ea-8758c47886bb", "metadata": {}, "outputs": [], "source": [ "organism = './bsubtilis/organism.json'\n", "inputs = './bsubtilis/input.json'" ] }, { "cell_type": "markdown", "id": "962f97bf-a03d-432f-916d-be11877033c4", "metadata": {}, "source": [ "## Create MEBuilder instance" ] }, { "cell_type": "markdown", "id": "f9988c4f-9bc2-4716-b8f9-46b02d533b90", "metadata": {}, "source": [ "For more information about this class, see [Architecture of coralME](coralMEArchitecture.ipynb)." ] }, { "cell_type": "code", "execution_count": 3, "id": "f83fbefc-c392-496d-9e1e-7820d1b6d60b", "metadata": { "tags": [] }, "outputs": [], "source": [ "builder = MEBuilder(*[organism, inputs])" ] }, { "cell_type": "markdown", "id": "32472ecb-9d39-477c-a611-3fb43d2e3254", "metadata": {}, "source": [ "## Generate files\n", "\n", "This corresponds to the _Synchronize_ and _Complement_ steps in [Architecture of coralME](coralMEArchitecture.ipynb)." ] }, { "cell_type": "code", "execution_count": 4, "id": "33e8a1a3-3206-4bc9-9407-3c31f5367d29", "metadata": { "ExecuteTime": { "end_time": "2022-12-12T06:30:28.060280Z", "start_time": "2022-12-12T06:30:01.706785Z" }, "scrolled": true, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Initiating file processing...\n", "~ Processing files for bsubtilis...\n", "Set parameter Username\n", "Academic license - for non-commercial use only - expires 2025-09-03\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Checking M-model metabolites... : 100.0%|██████████| 990/ 990 [00:00<00:00]\n", "Checking M-model genes... : 100.0%|██████████| 844/ 844 [00:00<00:00]\n", "Checking M-model reactions... : 100.0%|██████████| 1250/ 1250 [00:00<00:00]\n", "Generating complexes dataframe from optional proteins file... : 100.0%|██████████| 4554/ 4554 [00:00<00:00]\n", "Syncing optional genes file... 
: 100.0%|██████████| 4541/ 4541 [00:00<00:00]\n", "Looking for duplicates within datasets... : 100.0%|██████████| 5/ 5 [00:00<00:00]\n", "Gathering ID occurrences across datasets... : 100.0%|██████████| 10647/10647 [00:00<00:00]\n", "Solving duplicates across datasets... : 0.0%| | 0/ 0 [00:00]\n", "Pruning GenBank... : 100.0%|██████████| 1/ 1 [00:01<00:00]\n", "Updating Genbank file with optional files... : 100.0%|██████████| 4541/ 4541 [00:00<00:00]\n", "Syncing optional files with genbank contigs... : 100.0%|██████████| 6/ 6 [00:04<00:00]\n", "Modifying metabolites with manual curation... : 0.0%| | 0/ 0 [00:00]\n", "Modifying metabolic reactions with manual curation... : 100.0%|██████████| 36/ 36 [00:00<00:00]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "unknown metabolite '4crsol_c' created\n", "unknown metabolite 'dad__5_c' created\n", "unknown metabolite 'dhgly_c' created\n", "unknown metabolite '5drib_c' created\n", "unknown metabolite 'cu_e' created\n", "unknown metabolite 'cu_c' created\n", "unknown metabolite 'cbl1_e' created\n", "unknown metabolite 'cbl1_c' created\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Adding manual curation of complexes... : 100.0%|██████████| 438/ 438 [00:00<00:00]\n", "Getting sigma factors... : 100.0%|██████████| 38/ 38 [00:00<00:00]\n", "Getting generics from Genbank contigs... : 100.0%|██████████| 6/ 6 [00:00<00:00]\n", "Getting TU-gene associations from optional TUs file... : 100.0%|██████████| 1647/ 1647 [00:00<00:00]\n", "Adding protein location... : 100.0%|██████████| 4683/ 4683 [00:00<00:00]\n", "Purging M-model genes... : 100.0%|██████████| 851/ 851 [00:00<00:00]\n", "Getting enzyme-reaction associations... : 100.0%|██████████| 1262/ 1262 [00:00<00:00]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Reading bsubtilis done.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Gathering M-model compartments... 
: 100.0%|██████████| 2/ 2 [00:00<00:00]\n", "Fixing compartments in M-model metabolites... : 100.0%|██████████| 998/ 998 [00:00<00:00]\n", "Fixing missing names in M-model reactions... : 100.0%|██████████| 1262/ 1262 [00:00<00:00]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "~ Processing files for iJL1678b...\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Checking M-model metabolites... : 100.0%|██████████| 1660/ 1660 [00:00<00:00]\n", "Checking M-model genes... : 100.0%|██████████| 1271/ 1271 [00:00<00:00]\n", "Checking M-model reactions... : 100.0%|██████████| 2377/ 2377 [00:00<00:00]\n", "Looking for duplicates within datasets... : 100.0%|██████████| 5/ 5 [00:00<00:00]\n", "Gathering ID occurrences across datasets... : 100.0%|██████████| 8517/ 8517 [00:00<00:00]\n", "Solving duplicates across datasets... : 0.0%| | 0/ 0 [00:00]\n", "Getting sigma factors... : 100.0%|██████████| 7/ 7 [00:00<00:00]\n", "Getting TU-gene associations from optional TUs file... : 100.0%|██████████| 1647/ 1647 [00:00<00:00]\n", "Adding protein location... : 100.0%|██████████| 1304/ 1304 [00:00<00:00]\n", "Purging M-model genes... : 100.0%|██████████| 1271/ 1271 [00:00<00:00]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Reading iJL1678b done.\n", "~ Running BLAST with 1 threads...\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Converting Genbank contigs to FASTA for BLAST... : 100.0%|██████████| 6/ 6 [00:00<00:00]\n", "Converting Genbank contigs to FASTA for BLAST... : 100.0%|██████████| 1/ 1 [00:00<00:00]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "BLAST done.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Updating translocation machinery from homology... : 100.0%|██████████| 9/ 9 [00:00<00:00]\n", "Updating protein location from homology... : 100.0%|██████████| 514/ 514 [00:00<00:00]\n", "Updating translocation multipliers from homology... 
: 100.0%|██████████| 3/ 3 [00:00<00:00]\n", "Updating lipoprotein precursors from homology... : 100.0%|██████████| 14/ 14 [00:00<00:00]\n", "Updating cleaved-methionine proteins from homology... : 100.0%|██████████| 343/ 343 [00:00<00:00]\n", "Mapping M-metabolites to E-metabolites... : 100.0%|██████████| 147/ 147 [00:00<00:00]\n", "Updating generics from homology... : 100.0%|██████████| 10/ 10 [00:00<00:00]\n", "Updating folding from homology... : 100.0%|██████████| 2/ 2 [00:00<00:00]\n", "Updating ribosome subreaction machinery from homology... : 100.0%|██████████| 3/ 3 [00:00<00:00]\n", "Updating tRNA synthetases from homology... : 100.0%|██████████| 20/ 20 [00:00<00:00]\n", "Updating peptide release factors from homology... : 100.0%|██████████| 3/ 3 [00:00<00:00]\n", "Updating transcription subreactions machinery from homology... : 100.0%|██████████| 4/ 4 [00:00<00:00]\n", "Updating translation initiation subreactions from homology... : 100.0%|██████████| 4/ 4 [00:00<00:00]\n", "Updating translation elongation subreactions from homology... : 100.0%|██████████| 2/ 2 [00:00<00:00]\n", "Updating translation termination subreactions from homology... : 100.0%|██████████| 9/ 9 [00:00<00:00]\n", "Updating special tRNA subreactions from homology... : 100.0%|██████████| 1/ 1 [00:00<00:00]\n", "Updating RNA degradosome composition from homology... : 100.0%|██████████| 1/ 1 [00:00<00:00]\n", "Updating excision machinery from homology... : 100.0%|██████████| 3/ 3 [00:00<00:00]\n", "Updating tRNA subreactions from homology... : 100.0%|██████████| 6/ 6 [00:00<00:00]\n", "Updating lipid modification machinery from homology... : 100.0%|██████████| 2/ 2 [00:00<00:00]\n", "Fixing M-model metabolites with homology... : 100.0%|██████████| 998/ 998 [00:00<00:00]\n", "Updating enzyme reaction association... 
: 100.0%|██████████| 922/ 922 [00:00<00:00]\n", "Getting tRNA to codon dictionary from NC_000964.3 : 100.0%|██████████| 4537/ 4537 [00:03<00:00]\n", "Getting tRNA to codon dictionary from BSU_misc_RNA_38 : 100.0%|██████████| 2/ 2 [00:00<00:00]\n", "Getting tRNA to codon dictionary from G8J2-183 : 100.0%|██████████| 2/ 2 [00:00<00:00]\n", "Getting tRNA to codon dictionary from G8J2-180 : 100.0%|██████████| 2/ 2 [00:00<00:00]\n", "Getting tRNA to codon dictionary from G8J2-181 : 100.0%|██████████| 2/ 2 [00:00<00:00]\n", "Getting tRNA to codon dictionary from G8J2-182 : 100.0%|██████████| 2/ 2 [00:00<00:00]\n", "Checking defined translocation pathways... : 100.0%|██████████| 9/ 9 [00:00<00:00]\n", "Getting reaction Keffs... : 100.0%|██████████| 922/ 922 [00:00<00:00]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Writting the Organism-Specific Matrix...\n", "Organism-Specific Matrix saved to ./bsubtilis/building_data/automated-org-with-refs.xlsx file.\n", "File processing done.\n" ] } ], "source": [ "builder.generate_files(overwrite=True)" ] }, { "cell_type": "markdown", "id": "0e9f5993-cbad-44e9-a1ce-51c43e69f565", "metadata": {}, "source": [ "## Build ME-model\n", "\n", "This corresponds to _Build_ in [Architecture of coralME](coralMEArchitecture.ipynb)" ] }, { "cell_type": "code", "execution_count": 5, "id": "df64d444-7726-4ab3-b597-74990636c428", "metadata": { "ExecuteTime": { "end_time": "2022-12-12T06:28:36.607889Z", "start_time": "2022-12-12T06:27:59.542579Z" }, "scrolled": true, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Initiating ME-model reconstruction...\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Adding biomass constraint(s) into the ME-model... 
: 100.0%|██████████| 11/ 11 [00:00<00:00]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Read LP format model from file /tmp/tmp7trh8uhg.lp\n", "Reading time = 0.00 seconds\n", ": 997 rows, 2524 columns, 10562 nonzeros\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Read LP format model from file /tmp/tmp575tct7k.lp\n", "Reading time = 0.00 seconds\n", ": 997 rows, 2520 columns, 10544 nonzeros\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Adding Metabolites from M-model into the ME-model... : 100.0%|██████████| 1087/ 1087 [00:00<00:00]\n", "Adding Reactions from M-model into the ME-model... : 100.0%|██████████| 1277/ 1277 [00:00<00:00]\n", "Adding Transcriptional Units into the ME-model from user input... : 100.0%|██████████| 1515/ 1515 [00:09<00:00]\n", "Adding features from contig NC_000964.3 into the ME-model... : 100.0%|██████████| 4537/ 4537 [00:09<00:00]\n", "Adding features from contig BSU_misc_RNA_38 into the ME-model... : 100.0%|██████████| 2/ 2 [00:00<00:00]\n", "Adding features from contig G8J2-183 into the ME-model... : 100.0%|██████████| 2/ 2 [00:00<00:00]\n", "Adding features from contig G8J2-180 into the ME-model... : 100.0%|██████████| 2/ 2 [00:00<00:00]\n", "Adding features from contig G8J2-181 into the ME-model... : 100.0%|██████████| 2/ 2 [00:00<00:00]\n", "Adding features from contig G8J2-182 into the ME-model... : 100.0%|██████████| 2/ 2 [00:00<00:00]\n", "Updating all TranslationReaction and TranscriptionReaction... : 100.0%|██████████| 8389/ 8389 [00:21<00:00]\n", "Removing SubReactions from ComplexData... : 100.0%|██████████| 5007/ 5007 [00:00<00:00]\n", "Adding ComplexFormation into the ME-model... : 100.0%|██████████| 5007/ 5007 [00:00<00:00]\n", "Adding Generic(s) into the ME-model... : 100.0%|██████████| 10/ 10 [00:00<00:00]\n", "Processing StoichiometricData in ME-model... 
: 100.0%|██████████| 1045/ 1045 [00:00<00:00]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "ME-model was saved in the ./bsubtilis/ directory as MEModel-step1-bsubtilis.pkl\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Adding tRNA synthetase(s) information into the ME-model... : 100.0%|██████████| 306/ 306 [00:00<00:00]\n", "Adding tRNA modification SubReactions... : 100.0%|██████████| 19/ 19 [00:00<00:00]\n", "Associating tRNA modification enzyme(s) to tRNA(s)... : 100.0%|██████████| 33/ 33 [00:00<00:00]\n", "Adding SubReactions into TranslationReactions... : 100.0%|██████████| 4328/ 4328 [00:01<00:00]\n", "Adding RNA Polymerase(s) into the ME-model... : 100.0%|██████████| 39/ 39 [00:00<00:00]\n", "Associating a RNA Polymerase to each Transcriptional Unit... : 100.0%|██████████| 1515/ 1515 [00:00<00:00]\n", "Processing ComplexData in ME-model... : 100.0%|██████████| 322/ 322 [00:00<00:00]\n", "Adding ComplexFormation into the ME-model... : 100.0%|██████████| 5054/ 5054 [00:00<00:00]\n", "Adding SubReactions into TranslationReactions... : 100.0%|██████████| 4328/ 4328 [00:08<00:00]\n", "Adding Transcription SubReactions... : 100.0%|██████████| 3513/ 3513 [00:00<00:00]\n", "Processing StoichiometricData in SubReactionData... : 100.0%|██████████| 1650/ 1650 [00:00<00:00]\n", "Adding reaction subsystems from M-model into the ME-model... : 100.0%|██████████| 1277/ 1277 [00:00<00:00]\n", "Processing StoichiometricData in ME-model... : 100.0%|██████████| 1045/ 1045 [00:00<00:00]\n", "Updating ME-model Reactions... : 100.0%|██████████| 17480/17480 [01:25<00:00]\n", "Updating all FormationReactions... : 100.0%|██████████| 5054/ 5054 [00:00<00:00]\n", "Recalculation of the elemental contribution in SubReactions... : 100.0%|██████████| 277/ 277 [00:00<00:00]\n", "Updating all FormationReactions... : 100.0%|██████████| 5054/ 5054 [00:00<00:00]\n", "Updating FormationReactions involving a lipoyl prosthetic group... 
: 100.0%|██████████| 3/ 3 [00:00<00:00]\n", "Updating FormationReactions involving a glycyl radical... : 0.0%| | 0/ 0 [00:00]\n", "Estimating effective turnover rates for reactions using the SASA method... : 100.0%|██████████| 2882/ 2882 [00:00<00:00]\n", "Mapping effective turnover rates from user input... : 0.0%| | 0/ 0 [00:00]\n", "Setting the effective turnover rates using user input... : 100.0%|██████████| 2632/ 2632 [00:00<00:00]\n", "Pruning unnecessary ComplexData reactions... : 100.0%|██████████| 5054/ 5054 [00:06<00:00]\n", "Pruning unnecessary FoldedProtein reactions... : 0.0%| | 0/ 0 [00:00]\n", "Pruning unnecessary ProcessedProtein reactions... : 100.0%|██████████| 5647/ 5647 [00:01<00:00]\n", "Pruning unnecessary TranslatedGene reactions... : 100.0%|██████████| 4640/ 4640 [00:09<00:00]\n", "Pruning unnecessary TranscribedGene reactions... : 100.0%|██████████| 4540/ 4540 [00:05<00:00]\n", "Pruning unnecessary Transcriptional Units... : 100.0%|██████████| 3513/ 3513 [00:03<00:00]\n", "Pruning unnecessary ComplexData reactions... : 100.0%|██████████| 988/ 988 [00:00<00:00]\n", "Pruning unnecessary FoldedProtein reactions... : 0.0%| | 0/ 0 [00:00]\n", "Pruning unnecessary ProcessedProtein reactions... : 100.0%|██████████| 1354/ 1354 [00:00<00:00]\n", "Pruning unnecessary TranslatedGene reactions... : 100.0%|██████████| 1354/ 1354 [00:00<00:00]\n", "Pruning unnecessary TranscribedGene reactions... : 100.0%|██████████| 1159/ 1159 [00:02<00:00]\n", "Pruning unnecessary Transcriptional Units... : 100.0%|██████████| 792/ 792 [00:01<00:00]\n", "Pruning unnecessary ComplexData reactions... : 100.0%|██████████| 965/ 965 [00:00<00:00]\n", "Pruning unnecessary FoldedProtein reactions... : 0.0%| | 0/ 0 [00:00]\n", "Pruning unnecessary ProcessedProtein reactions... : 100.0%|██████████| 1352/ 1352 [00:00<00:00]\n", "Pruning unnecessary TranslatedGene reactions... : 100.0%|██████████| 1352/ 1352 [00:00<00:00]\n", "Pruning unnecessary TranscribedGene reactions... 
: 100.0%|██████████| 1157/ 1157 [00:01<00:00]\n", "Pruning unnecessary Transcriptional Units... : 100.0%|██████████| 790/ 790 [00:01<00:00]\n", "Pruning unnecessary ComplexData reactions... : 100.0%|██████████| 964/ 964 [00:00<00:00]\n", "Pruning unnecessary FoldedProtein reactions... : 0.0%| | 0/ 0 [00:00]\n", "Pruning unnecessary ProcessedProtein reactions... : 100.0%|██████████| 1351/ 1351 [00:00<00:00]\n", "Pruning unnecessary TranslatedGene reactions... : 100.0%|██████████| 1351/ 1351 [00:00<00:00]\n", "Pruning unnecessary TranscribedGene reactions... : 100.0%|██████████| 1156/ 1156 [00:02<00:00]\n", "Pruning unnecessary Transcriptional Units... : 100.0%|██████████| 789/ 789 [00:00<00:00]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "ME-model was saved in the ./bsubtilis/ directory as MEModel-step2-bsubtilis.pkl\n", "ME-model reconstruction is done.\n", "Number of metabolites in the ME-model is 4630 (+364.39%, from 997)\n", "Number of reactions in the ME-model is 7758 (+514.74%, from 1262)\n", "Number of genes in the ME-model is 1154 (+36.73%, from 844)\n", "Number of missing genes from reconstruction with homology, but no function is 104. Check the curation notes for more details.\n" ] } ], "source": [ "builder.build_me_model(overwrite=False)" ] }, { "cell_type": "markdown", "id": "eb86277a-16e5-490b-bcb2-99e0536e49da", "metadata": {}, "source": [ "## Troubleshoot ME-model\n", "\n", "This corresponds to _Find gaps_ in [Architecture of coralME](coralMEArchitecture.ipynb)" ] }, { "cell_type": "code", "execution_count": 6, "id": "49a10838-9065-402e-a364-0898f5a4d2ad", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The MINOS and quad MINOS solvers are a courtesy of Prof Michael A. Saunders. Please cite Ma, D., Yang, L., Fleming, R. et al. Reliable and efficient solution of genome-scale models of Metabolism and macromolecular Expression. Sci Rep 7, 40863 (2017). 
https://doi.org/10.1038/srep40863\n", "\n", "~ Troubleshooting started...\n", " Checking if the ME-model can simulate growth without gapfilling reactions...\n", " Original ME-model is feasible with a tested growth rate of 0.001000 1/h\n", "~ Final step. Fully optimizing with precision 1e-6 and save solution into the ME-model...\n", " Gapfilled ME-model is feasible with growth rate 0.101279 (M-model: 0.117966).\n", "ME-model was saved in the ./bsubtilis/ directory as MEModel-step3-bsubtilis-TS.pkl\n" ] } ], "source": [ "builder.troubleshoot(growth_key_and_value={builder.me_model.mu: 0.001}, solver=\"qminos\")" ] }, { "cell_type": "markdown", "id": "cad22d76-a979-4ac8-a7ea-f1885d041c6a", "metadata": {}, "source": [ "