5. Reconstructing with BioCyc information and curated files
In this tutorial we will reconstruct a Bacillus subtilis ME-model from just the input files.
5.1. Import libraries
[1]:
from coralme.builder.main import MEBuilder
import coralme
5.2. Path to configuration files
For more information about these files, see Description of inputs. The configuration and file layout used here follow those in this B. subtilis ME-model repository.
[2]:
organism = './bsubtilis/organism.json'
inputs = './bsubtilis/input.json'
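Before creating the builder, it can help to confirm that both configuration files exist and parse as JSON. The following is a minimal sketch using only the Python standard library. The helper name `check_config` is hypothetical (it is not part of coralME), and the sketch makes no assumption about which keys the files contain.

```python
import json
from pathlib import Path


def check_config(path):
    """Return the parsed JSON content of `path`, or raise a descriptive error.

    Hypothetical helper for illustration only; not part of coralME.
    """
    config_path = Path(path)
    if not config_path.is_file():
        raise FileNotFoundError(f"Configuration file not found: {config_path}")
    with config_path.open(encoding="utf-8") as handle:
        try:
            return json.load(handle)
        except json.JSONDecodeError as error:
            raise ValueError(f"{config_path} is not valid JSON: {error}") from error


if __name__ == "__main__":
    # Same paths as in the cell above; adjust them to your own layout.
    for name in ("./bsubtilis/organism.json", "./bsubtilis/input.json"):
        try:
            check_config(name)
            print(f"{name}: OK")
        except (FileNotFoundError, ValueError) as error:
            print(f"{name}: {error}")
```

Catching path and syntax problems at this point gives a clearer error than one raised later during file processing.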
5.3. Create MEBuilder instance
For more information about this class, see Architecture of coralME.
[3]:
builder = MEBuilder(*[organism, inputs])
5.4. Generate files
This corresponds to the Synchronize and Complement steps in Architecture of coralME.
[4]:
builder.generate_files(overwrite=True)
Initiating file processing...
~ Processing files for bsubtilis...
Set parameter Username
Academic license - for non-commercial use only - expires 2025-09-03
Checking M-model metabolites... : 100.0%|██████████| 990/ 990 [00:00<00:00]
Checking M-model genes... : 100.0%|██████████| 844/ 844 [00:00<00:00]
Checking M-model reactions... : 100.0%|██████████| 1250/ 1250 [00:00<00:00]
Generating complexes dataframe from optional proteins file... : 100.0%|██████████| 4554/ 4554 [00:00<00:00]
Syncing optional genes file... : 100.0%|██████████| 4541/ 4541 [00:00<00:00]
Looking for duplicates within datasets... : 100.0%|██████████| 5/ 5 [00:00<00:00]
Gathering ID occurrences across datasets... : 100.0%|██████████| 10647/10647 [00:00<00:00]
Solving duplicates across datasets... : 0.0%| | 0/ 0 [00:00<?]
Pruning GenBank... : 100.0%|██████████| 1/ 1 [00:01<00:00]
Updating Genbank file with optional files... : 100.0%|██████████| 4541/ 4541 [00:00<00:00]
Syncing optional files with genbank contigs... : 100.0%|██████████| 6/ 6 [00:04<00:00]
Modifying metabolites with manual curation... : 0.0%| | 0/ 0 [00:00<?]
Modifying metabolic reactions with manual curation... : 100.0%|██████████| 36/ 36 [00:00<00:00]
unknown metabolite '4crsol_c' created
unknown metabolite 'dad__5_c' created
unknown metabolite 'dhgly_c' created
unknown metabolite '5drib_c' created
unknown metabolite 'cu_e' created
unknown metabolite 'cu_c' created
unknown metabolite 'cbl1_e' created
unknown metabolite 'cbl1_c' created
Adding manual curation of complexes... : 100.0%|██████████| 438/ 438 [00:00<00:00]
Getting sigma factors... : 100.0%|██████████| 38/ 38 [00:00<00:00]
Getting generics from Genbank contigs... : 100.0%|██████████| 6/ 6 [00:00<00:00]
Getting TU-gene associations from optional TUs file... : 100.0%|██████████| 1647/ 1647 [00:00<00:00]
Adding protein location... : 100.0%|██████████| 4683/ 4683 [00:00<00:00]
Purging M-model genes... : 100.0%|██████████| 851/ 851 [00:00<00:00]
Getting enzyme-reaction associations... : 100.0%|██████████| 1262/ 1262 [00:00<00:00]
Reading bsubtilis done.
Gathering M-model compartments... : 100.0%|██████████| 2/ 2 [00:00<00:00]
Fixing compartments in M-model metabolites... : 100.0%|██████████| 998/ 998 [00:00<00:00]
Fixing missing names in M-model reactions... : 100.0%|██████████| 1262/ 1262 [00:00<00:00]
~ Processing files for iJL1678b...
Checking M-model metabolites... : 100.0%|██████████| 1660/ 1660 [00:00<00:00]
Checking M-model genes... : 100.0%|██████████| 1271/ 1271 [00:00<00:00]
Checking M-model reactions... : 100.0%|██████████| 2377/ 2377 [00:00<00:00]
Looking for duplicates within datasets... : 100.0%|██████████| 5/ 5 [00:00<00:00]
Gathering ID occurrences across datasets... : 100.0%|██████████| 8517/ 8517 [00:00<00:00]
Solving duplicates across datasets... : 0.0%| | 0/ 0 [00:00<?]
Getting sigma factors... : 100.0%|██████████| 7/ 7 [00:00<00:00]
Getting TU-gene associations from optional TUs file... : 100.0%|██████████| 1647/ 1647 [00:00<00:00]
Adding protein location... : 100.0%|██████████| 1304/ 1304 [00:00<00:00]
Purging M-model genes... : 100.0%|██████████| 1271/ 1271 [00:00<00:00]
Reading iJL1678b done.
~ Running BLAST with 1 threads...
Converting Genbank contigs to FASTA for BLAST... : 100.0%|██████████| 6/ 6 [00:00<00:00]
Converting Genbank contigs to FASTA for BLAST... : 100.0%|██████████| 1/ 1 [00:00<00:00]
BLAST done.
Updating translocation machinery from homology... : 100.0%|██████████| 9/ 9 [00:00<00:00]
Updating protein location from homology... : 100.0%|██████████| 514/ 514 [00:00<00:00]
Updating translocation multipliers from homology... : 100.0%|██████████| 3/ 3 [00:00<00:00]
Updating lipoprotein precursors from homology... : 100.0%|██████████| 14/ 14 [00:00<00:00]
Updating cleaved-methionine proteins from homology... : 100.0%|██████████| 343/ 343 [00:00<00:00]
Mapping M-metabolites to E-metabolites... : 100.0%|██████████| 147/ 147 [00:00<00:00]
Updating generics from homology... : 100.0%|██████████| 10/ 10 [00:00<00:00]
Updating folding from homology... : 100.0%|██████████| 2/ 2 [00:00<00:00]
Updating ribosome subreaction machinery from homology... : 100.0%|██████████| 3/ 3 [00:00<00:00]
Updating tRNA synthetases from homology... : 100.0%|██████████| 20/ 20 [00:00<00:00]
Updating peptide release factors from homology... : 100.0%|██████████| 3/ 3 [00:00<00:00]
Updating transcription subreactions machinery from homology... : 100.0%|██████████| 4/ 4 [00:00<00:00]
Updating translation initiation subreactions from homology... : 100.0%|██████████| 4/ 4 [00:00<00:00]
Updating translation elongation subreactions from homology... : 100.0%|██████████| 2/ 2 [00:00<00:00]
Updating translation termination subreactions from homology... : 100.0%|██████████| 9/ 9 [00:00<00:00]
Updating special tRNA subreactions from homology... : 100.0%|██████████| 1/ 1 [00:00<00:00]
Updating RNA degradosome composition from homology... : 100.0%|██████████| 1/ 1 [00:00<00:00]
Updating excision machinery from homology... : 100.0%|██████████| 3/ 3 [00:00<00:00]
Updating tRNA subreactions from homology... : 100.0%|██████████| 6/ 6 [00:00<00:00]
Updating lipid modification machinery from homology... : 100.0%|██████████| 2/ 2 [00:00<00:00]
Fixing M-model metabolites with homology... : 100.0%|██████████| 998/ 998 [00:00<00:00]
Updating enzyme reaction association... : 100.0%|██████████| 922/ 922 [00:00<00:00]
Getting tRNA to codon dictionary from NC_000964.3 : 100.0%|██████████| 4537/ 4537 [00:03<00:00]
Getting tRNA to codon dictionary from BSU_misc_RNA_38 : 100.0%|██████████| 2/ 2 [00:00<00:00]
Getting tRNA to codon dictionary from G8J2-183 : 100.0%|██████████| 2/ 2 [00:00<00:00]
Getting tRNA to codon dictionary from G8J2-180 : 100.0%|██████████| 2/ 2 [00:00<00:00]
Getting tRNA to codon dictionary from G8J2-181 : 100.0%|██████████| 2/ 2 [00:00<00:00]
Getting tRNA to codon dictionary from G8J2-182 : 100.0%|██████████| 2/ 2 [00:00<00:00]
Checking defined translocation pathways... : 100.0%|██████████| 9/ 9 [00:00<00:00]
Getting reaction Keffs... : 100.0%|██████████| 922/ 922 [00:00<00:00]
Writting the Organism-Specific Matrix...
Organism-Specific Matrix saved to ./bsubtilis/building_data/automated-org-with-refs.xlsx file.
File processing done.
5.5. Build ME-model
This corresponds to the Build step in Architecture of coralME.
[5]:
builder.build_me_model(overwrite=False)
Initiating ME-model reconstruction...
Adding biomass constraint(s) into the ME-model... : 100.0%|██████████| 11/ 11 [00:00<00:00]
Read LP format model from file /tmp/tmp7trh8uhg.lp
Reading time = 0.00 seconds
: 997 rows, 2524 columns, 10562 nonzeros
Read LP format model from file /tmp/tmp575tct7k.lp
Reading time = 0.00 seconds
: 997 rows, 2520 columns, 10544 nonzeros
Adding Metabolites from M-model into the ME-model... : 100.0%|██████████| 1087/ 1087 [00:00<00:00]
Adding Reactions from M-model into the ME-model... : 100.0%|██████████| 1277/ 1277 [00:00<00:00]
Adding Transcriptional Units into the ME-model from user input... : 100.0%|██████████| 1515/ 1515 [00:09<00:00]
Adding features from contig NC_000964.3 into the ME-model... : 100.0%|██████████| 4537/ 4537 [00:09<00:00]
Adding features from contig BSU_misc_RNA_38 into the ME-model... : 100.0%|██████████| 2/ 2 [00:00<00:00]
Adding features from contig G8J2-183 into the ME-model... : 100.0%|██████████| 2/ 2 [00:00<00:00]
Adding features from contig G8J2-180 into the ME-model... : 100.0%|██████████| 2/ 2 [00:00<00:00]
Adding features from contig G8J2-181 into the ME-model... : 100.0%|██████████| 2/ 2 [00:00<00:00]
Adding features from contig G8J2-182 into the ME-model... : 100.0%|██████████| 2/ 2 [00:00<00:00]
Updating all TranslationReaction and TranscriptionReaction... : 100.0%|██████████| 8389/ 8389 [00:21<00:00]
Removing SubReactions from ComplexData... : 100.0%|██████████| 5007/ 5007 [00:00<00:00]
Adding ComplexFormation into the ME-model... : 100.0%|██████████| 5007/ 5007 [00:00<00:00]
Adding Generic(s) into the ME-model... : 100.0%|██████████| 10/ 10 [00:00<00:00]
Processing StoichiometricData in ME-model... : 100.0%|██████████| 1045/ 1045 [00:00<00:00]
ME-model was saved in the ./bsubtilis/ directory as MEModel-step1-bsubtilis.pkl
Adding tRNA synthetase(s) information into the ME-model... : 100.0%|██████████| 306/ 306 [00:00<00:00]
Adding tRNA modification SubReactions... : 100.0%|██████████| 19/ 19 [00:00<00:00]
Associating tRNA modification enzyme(s) to tRNA(s)... : 100.0%|██████████| 33/ 33 [00:00<00:00]
Adding SubReactions into TranslationReactions... : 100.0%|██████████| 4328/ 4328 [00:01<00:00]
Adding RNA Polymerase(s) into the ME-model... : 100.0%|██████████| 39/ 39 [00:00<00:00]
Associating a RNA Polymerase to each Transcriptional Unit... : 100.0%|██████████| 1515/ 1515 [00:00<00:00]
Processing ComplexData in ME-model... : 100.0%|██████████| 322/ 322 [00:00<00:00]
Adding ComplexFormation into the ME-model... : 100.0%|██████████| 5054/ 5054 [00:00<00:00]
Adding SubReactions into TranslationReactions... : 100.0%|██████████| 4328/ 4328 [00:08<00:00]
Adding Transcription SubReactions... : 100.0%|██████████| 3513/ 3513 [00:00<00:00]
Processing StoichiometricData in SubReactionData... : 100.0%|██████████| 1650/ 1650 [00:00<00:00]
Adding reaction subsystems from M-model into the ME-model... : 100.0%|██████████| 1277/ 1277 [00:00<00:00]
Processing StoichiometricData in ME-model... : 100.0%|██████████| 1045/ 1045 [00:00<00:00]
Updating ME-model Reactions... : 100.0%|██████████| 17480/17480 [01:25<00:00]
Updating all FormationReactions... : 100.0%|██████████| 5054/ 5054 [00:00<00:00]
Recalculation of the elemental contribution in SubReactions... : 100.0%|██████████| 277/ 277 [00:00<00:00]
Updating all FormationReactions... : 100.0%|██████████| 5054/ 5054 [00:00<00:00]
Updating FormationReactions involving a lipoyl prosthetic group... : 100.0%|██████████| 3/ 3 [00:00<00:00]
Updating FormationReactions involving a glycyl radical... : 0.0%| | 0/ 0 [00:00<?]
Estimating effective turnover rates for reactions using the SASA method... : 100.0%|██████████| 2882/ 2882 [00:00<00:00]
Mapping effective turnover rates from user input... : 0.0%| | 0/ 0 [00:00<?]
Setting the effective turnover rates using user input... : 100.0%|██████████| 2632/ 2632 [00:00<00:00]
Pruning unnecessary ComplexData reactions... : 100.0%|██████████| 5054/ 5054 [00:06<00:00]
Pruning unnecessary FoldedProtein reactions... : 0.0%| | 0/ 0 [00:00<?]
Pruning unnecessary ProcessedProtein reactions... : 100.0%|██████████| 5647/ 5647 [00:01<00:00]
Pruning unnecessary TranslatedGene reactions... : 100.0%|██████████| 4640/ 4640 [00:09<00:00]
Pruning unnecessary TranscribedGene reactions... : 100.0%|██████████| 4540/ 4540 [00:05<00:00]
Pruning unnecessary Transcriptional Units... : 100.0%|██████████| 3513/ 3513 [00:03<00:00]
Pruning unnecessary ComplexData reactions... : 100.0%|██████████| 988/ 988 [00:00<00:00]
Pruning unnecessary FoldedProtein reactions... : 0.0%| | 0/ 0 [00:00<?]
Pruning unnecessary ProcessedProtein reactions... : 100.0%|██████████| 1354/ 1354 [00:00<00:00]
Pruning unnecessary TranslatedGene reactions... : 100.0%|██████████| 1354/ 1354 [00:00<00:00]
Pruning unnecessary TranscribedGene reactions... : 100.0%|██████████| 1159/ 1159 [00:02<00:00]
Pruning unnecessary Transcriptional Units... : 100.0%|██████████| 792/ 792 [00:01<00:00]
Pruning unnecessary ComplexData reactions... : 100.0%|██████████| 965/ 965 [00:00<00:00]
Pruning unnecessary FoldedProtein reactions... : 0.0%| | 0/ 0 [00:00<?]
Pruning unnecessary ProcessedProtein reactions... : 100.0%|██████████| 1352/ 1352 [00:00<00:00]
Pruning unnecessary TranslatedGene reactions... : 100.0%|██████████| 1352/ 1352 [00:00<00:00]
Pruning unnecessary TranscribedGene reactions... : 100.0%|██████████| 1157/ 1157 [00:01<00:00]
Pruning unnecessary Transcriptional Units... : 100.0%|██████████| 790/ 790 [00:01<00:00]
Pruning unnecessary ComplexData reactions... : 100.0%|██████████| 964/ 964 [00:00<00:00]
Pruning unnecessary FoldedProtein reactions... : 0.0%| | 0/ 0 [00:00<?]
Pruning unnecessary ProcessedProtein reactions... : 100.0%|██████████| 1351/ 1351 [00:00<00:00]
Pruning unnecessary TranslatedGene reactions... : 100.0%|██████████| 1351/ 1351 [00:00<00:00]
Pruning unnecessary TranscribedGene reactions... : 100.0%|██████████| 1156/ 1156 [00:02<00:00]
Pruning unnecessary Transcriptional Units... : 100.0%|██████████| 789/ 789 [00:00<00:00]
ME-model was saved in the ./bsubtilis/ directory as MEModel-step2-bsubtilis.pkl
ME-model reconstruction is done.
Number of metabolites in the ME-model is 4630 (+364.39%, from 997)
Number of reactions in the ME-model is 7758 (+514.74%, from 1262)
Number of genes in the ME-model is 1154 (+36.73%, from 844)
Number of missing genes from reconstruction with homology, but no function is 104. Check the curation notes for more details.
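The percentages in the summary above are relative increases over the M-model counts. As a quick check, the arithmetic can be reproduced with the numbers copied from that output. This is only a worked example; `percent_change` is an illustrative helper, not a coralME function.

```python
def percent_change(new, old):
    """Relative change from `old` to `new`, expressed as a percentage."""
    return (new - old) / old * 100.0


# Counts copied from the build summary above: (ME-model, M-model).
summary = {
    "metabolites": (4630, 997),
    "reactions": (7758, 1262),
    "genes": (1154, 844),
}

for label, (me_count, m_count) in summary.items():
    print(f"{label}: {percent_change(me_count, m_count):+.2f}%")
# metabolites: +364.39%
# reactions: +514.74%
# genes: +36.73%
```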
5.6. Troubleshoot ME-model
This corresponds to the Find gaps step in Architecture of coralME.
[6]:
builder.troubleshoot(growth_key_and_value={builder.me_model.mu: 0.001}, solver="qminos")
The MINOS and quad MINOS solvers are a courtesy of Prof Michael A. Saunders. Please cite Ma, D., Yang, L., Fleming, R. et al. Reliable and efficient solution of genome-scale models of Metabolism and macromolecular Expression. Sci Rep 7, 40863 (2017). https://doi.org/10.1038/srep40863
~ Troubleshooting started...
Checking if the ME-model can simulate growth without gapfilling reactions...
Original ME-model is feasible with a tested growth rate of 0.001000 1/h
~ Final step. Fully optimizing with precision 1e-6 and save solution into the ME-model...
Gapfilled ME-model is feasible with growth rate 0.101279 (M-model: 0.117966).
ME-model was saved in the ./bsubtilis/ directory as MEModel-step3-bsubtilis-TS.pkl
Note: We use 0.001 (1/h) as the standard growth rate for the feasibility check, but you can change it. A value that is too high can put significant strain on the model and produce too many gaps to start with; a value that is too low might not reveal all the gaps that need to be filled.
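To put the growth rates reported by the troubleshooting output in context, they can be compared with each other and converted to doubling times. The sketch below assumes both reported rates are in 1/h, the unit printed for the tested growth rate; the output does not state the unit on the final line, so treat that as an assumption.

```python
import math

# Values copied from the troubleshooting output above (assumed to be in 1/h).
me_growth = 0.101279  # ME-model growth rate
m_growth = 0.117966   # M-model growth rate


def doubling_time(growth_rate):
    """Doubling time in hours for an exponential growth rate given in 1/h."""
    return math.log(2) / growth_rate


ratio = me_growth / m_growth
print(f"ME-model / M-model growth ratio: {ratio:.3f}")
print(f"ME-model doubling time: {doubling_time(me_growth):.2f} h")
print(f"M-model doubling time: {doubling_time(m_growth):.2f} h")
```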
5.7. Save as JSON
[7]:
coralme.io.json.save_json_me_model(builder.me_model, "./bsubtilis/MEModel-step3-bsubtilis-TS.json")
2024-11-26 17:43:32,509 The metabolite 'pg160_p' must exist in the ME-model to calculate the element contribution.
2024-11-26 17:43:32,510 The metabolite '2agpg160_p' must exist in the ME-model to calculate the element contribution.
2024-11-26 17:43:32,510 The metabolite 'pe160_p' must exist in the ME-model to calculate the element contribution.
2024-11-26 17:43:32,510 The metabolite '2agpe160_p' must exist in the ME-model to calculate the element contribution.
2024-11-26 17:43:32,909 Metabolite '4fe4s_c' does not have a formula. If it is a 'Complex', its formula will be determined from amino acid composition and prosthetic groups stoichiometry. Otherwise, please add it to the M-model.
2024-11-26 17:43:32,910 Metabolite '4fe4s_c' does not have a formula. If it is a 'Complex', its formula will be determined from amino acid composition and prosthetic groups stoichiometry. Otherwise, please add it to the M-model.
2024-11-26 17:43:32,910 Metabolite '2fe2s_c' does not have a formula. If it is a 'Complex', its formula will be determined from amino acid composition and prosthetic groups stoichiometry. Otherwise, please add it to the M-model.
2024-11-26 17:43:32,910 Metabolite '2fe2s_c' does not have a formula. If it is a 'Complex', its formula will be determined from amino acid composition and prosthetic groups stoichiometry. Otherwise, please add it to the M-model.
2024-11-26 17:43:32,910 Metabolite 'hemed_c' does not have a formula. If it is a 'Complex', its formula will be determined from amino acid composition and prosthetic groups stoichiometry. Otherwise, please add it to the M-model.
2024-11-26 17:43:32,911 Metabolite 'hemed_c' does not have a formula. If it is a 'Complex', its formula will be determined from amino acid composition and prosthetic groups stoichiometry. Otherwise, please add it to the M-model.
2024-11-26 17:43:32,911 Metabolite 'fmnh2_c' does not have a formula. If it is a 'Complex', its formula will be determined from amino acid composition and prosthetic groups stoichiometry. Otherwise, please add it to the M-model.
2024-11-26 17:43:32,911 Metabolite 'fmnh2_c' does not have a formula. If it is a 'Complex', its formula will be determined from amino acid composition and prosthetic groups stoichiometry. Otherwise, please add it to the M-model.
2024-11-26 17:43:32,912 Metabolite 'tl_c' does not have a formula. If it is a 'Complex', its formula will be determined from amino acid composition and prosthetic groups stoichiometry. Otherwise, please add it to the M-model.
2024-11-26 17:43:32,912 Metabolite 'tl_c' does not have a formula. If it is a 'Complex', its formula will be determined from amino acid composition and prosthetic groups stoichiometry. Otherwise, please add it to the M-model.
2024-11-26 17:43:32,912 Metabolite 'cs_c' does not have a formula. If it is a 'Complex', its formula will be determined from amino acid composition and prosthetic groups stoichiometry. Otherwise, please add it to the M-model.
2024-11-26 17:43:32,912 Metabolite 'cs_c' does not have a formula. If it is a 'Complex', its formula will be determined from amino acid composition and prosthetic groups stoichiometry. Otherwise, please add it to the M-model.
2024-11-26 17:43:32,913 The metabolite 'adocbl_c' must exist in the ME-model to calculate the element contribution.
2024-11-26 17:43:32,913 The metabolite 'adocbl_c' must exist in the ME-model to calculate the element contribution.
2024-11-26 17:43:32,913 The metabolite 'coo_c' must exist in the ME-model to calculate the element contribution.
2024-11-26 17:43:32,913 The metabolite 'coo_c' must exist in the ME-model to calculate the element contribution.
2024-11-26 17:43:32,913 The metabolite 'cosh_c' must exist in the ME-model to calculate the element contribution.
2024-11-26 17:43:32,914 The metabolite 'cosh_c' must exist in the ME-model to calculate the element contribution.
2024-11-26 17:43:32,914 The metabolite 'lipo_c' must exist in the ME-model to calculate the element contribution.
2024-11-26 17:43:32,914 The metabolite 'lipo_c' must exist in the ME-model to calculate the element contribution.
2024-11-26 17:43:32,914 Metabolite 'lipoyl_c' does not have a formula. If it is a 'Complex', its formula will be determined from amino acid composition and prosthetic groups stoichiometry. Otherwise, please add it to the M-model.
2024-11-26 17:43:32,914 Metabolite 'lipoyl_c' does not have a formula. If it is a 'Complex', its formula will be determined from amino acid composition and prosthetic groups stoichiometry. Otherwise, please add it to the M-model.
2024-11-26 17:43:32,915 The metabolite 'li_c' must exist in the ME-model to calculate the element contribution.
2024-11-26 17:43:32,915 The metabolite 'li_c' must exist in the ME-model to calculate the element contribution.
2024-11-26 17:43:32,915 The metabolite 'cl_c' must exist in the ME-model to calculate the element contribution.
2024-11-26 17:43:32,915 The metabolite 'cl_c' must exist in the ME-model to calculate the element contribution.
2024-11-26 17:43:32,915 The metabolite 'bmocogdp_c' must exist in the ME-model to calculate the element contribution.
2024-11-26 17:43:32,915 The metabolite 'bmocogdp_c' must exist in the ME-model to calculate the element contribution.
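Many of the messages above are repeated, so it can be easier to review them as a deduplicated list grouped by message type. The following is a minimal sketch that assumes the messages have already been captured as text (for example, copied from the notebook output). The function name `summarize_warnings` and the group labels are hypothetical, and the regular expressions match only the two message shapes shown above.

```python
import re

# Two message shapes seen in the output above; timestamps are ignored.
MISSING = re.compile(r"The metabolite '([^']+)' must exist in the ME-model")
NO_FORMULA = re.compile(r"Metabolite '([^']+)' does not have a formula")


def summarize_warnings(log_text):
    """Group unique metabolite IDs by message type, keeping first-seen order."""
    groups = {"missing": [], "no_formula": []}
    for line in log_text.splitlines():
        for label, pattern in (("missing", MISSING), ("no_formula", NO_FORMULA)):
            match = pattern.search(line)
            if match and match.group(1) not in groups[label]:
                groups[label].append(match.group(1))
    return groups


# A few lines copied verbatim from the output above.
sample = """\
2024-11-26 17:43:32,509 The metabolite 'pg160_p' must exist in the ME-model to calculate the element contribution.
2024-11-26 17:43:32,909 Metabolite '4fe4s_c' does not have a formula. If it is a 'Complex', its formula will be determined from amino acid composition and prosthetic groups stoichiometry. Otherwise, please add it to the M-model.
2024-11-26 17:43:32,910 Metabolite '4fe4s_c' does not have a formula. If it is a 'Complex', its formula will be determined from amino acid composition and prosthetic groups stoichiometry. Otherwise, please add it to the M-model.
2024-11-26 17:43:32,913 The metabolite 'adocbl_c' must exist in the ME-model to calculate the element contribution.
"""

print(summarize_warnings(sample))
# {'missing': ['pg160_p', 'adocbl_c'], 'no_formula': ['4fe4s_c']}
```

The resulting lists give a short set of metabolite IDs to check against the M-model and the curation files.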