Update README.md

Create an initial update to README.md for Mamba environment
Alex Morehead 2024-03-09 19:09:13 -06:00 committed by GitHub
parent 6bc5c745a2
commit ea81b735aa

@@ -20,21 +20,29 @@ RFAA is not accurate for all cases, but produces useful error estimates to allow
 <a id="set-up"></a>
 ### Setup/Installation
-1. Clone the package
+1. Install Mamba
+```
+wget "https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-$(uname)-$(uname -m).sh"
+bash Mambaforge-$(uname)-$(uname -m).sh # accept all terms and install to the default location
+rm Mambaforge-$(uname)-$(uname -m).sh # (optionally) remove installer after using it
+source ~/.bashrc # alternatively, one can restart their shell session to achieve the same result
+```
+2. Clone the package
 ```
 git clone https://github.com/baker-laboratory/RoseTTAFold-All-Atom
 cd RoseTTAFold-All-Atom
 ```
-2. Download the container used to run RFAA.
+3. Create Mamba environment
 ```
-wget http://files.ipd.uw.edu/pub/RF-All-Atom/containers/SE3nv-20240131.sif
+mamba env create -f environment.yaml
+conda activate RFAA # NOTE: one still needs to use `conda` to (de)activate environments
+pip3 install -e .
 ```
-3. Download the model weights.
+4. Download the model weights.
 ```
 wget http://files.ipd.uw.edu/pub/RF-All-Atom/weights/RFAA_paper_weights.pt
 ```
-4. Download sequence databases for MSA and template generation.
+5. Download sequence databases for MSA and template generation.
 ```
 # uniref30 [46G]
 wget http://wwwuser.gwdg.de/~compbiol/uniclust/2020_06/UniRef30_2020_06_hhsuite.tar.gz
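Reviewer note on the hunk above: the new setup steps depend on `mamba`, `conda`, `python`, and `pip3` being on `PATH` and on the `RFAA` environment being active. The following is a minimal preflight sketch, not part of the commit. The helper name `check_tool` is hypothetical. `CONDA_DEFAULT_ENV` is the variable conda sets on activation, and the expected value `RFAA` is taken from the `conda activate RFAA` line in the diff.

```shell
# Hypothetical helper: report whether a command is available on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# Installer file name that the README's wget line resolves to on this machine.
installer="Mambaforge-$(uname)-$(uname -m).sh"
echo "installer: ${installer}"

# Tools used by the setup steps in the diff.
for tool in mamba conda python pip3; do
    check_tool "$tool"
done

# Set by `conda activate`; should read RFAA after step 3 of the new README.
echo "active environment: ${CONDA_DEFAULT_ENV:-none}"
```

Running this before `pip3 install -e .` shows whether the shell session has picked up the Mamba installation, which is the situation the `source ~/.bashrc` line in the diff is meant to handle.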
@@ -56,11 +64,9 @@ tar xfz pdb100_2021Mar03.tar.gz
 We use a library called Hydra to compose config files for predictions. The actual script that runs the model is in `rf2aa/run_inference.py` and default parameters that were used to train the model are in `rf2aa/config/inference/base.yaml`. We highly suggest using the default parameters since those are closest to the training task for RFAA but we have found that increasing loader_params.MAXCYCLE=10 (default set to 4) gives better results for hard cases (as noted in the paper).
-We use a container system called apptainers which have very simple syntax. Instead of developing a local conda environment, users can use the apptainer to run the model which has all the dependencies already packaged.
 The general way to run the model is as follows:
 ```
-SE3nv-20240131.sif -m rf2aa.run_inference --config-name {your inference config}
+python -m rf2aa.run_inference --config-name {your inference config}
 ```
 The main inputs into the model are split into:
 - protein inputs (protein_inputs)
@@ -90,7 +96,7 @@ When specifying the fasta file for your protein, you might notice that it is nes
 Now to predict the sample monomer structure, run:
 ```
-SE3nv-20240131.sif -m rf2aa.run_inference --config-name protein
+python -m rf2aa.run_inference --config-name protein
 ```
 <a id="p-na-complex"></a>
@@ -118,7 +124,7 @@ This repo currently does not support making RNA MSAs or pairing protein MSAs wit
 Now, predict the example protein/NA complex.
 ```
-SE3nv-20240131.sif -m rf2aa.run_inference --config-name nucleic_acid
+python -m rf2aa.run_inference --config-name nucleic_acid
 ```
 <a id="p-sm-complex"></a>
 ### Predicting Protein Small Molecule Complexes
@@ -143,7 +149,7 @@ Small molecule inputs are provided as sdf files or smiles strings and users are
 To predict the example:
 ```
-SE3nv-20240131.sif -m rf2aa.run_inference --config-name protein_sm
+python -m rf2aa.run_inference --config-name protein_sm
 ```
 <a id="higher-order"></a>
 ### Predicting Higher Order Complexes
@@ -172,7 +178,7 @@ sm_inputs:
 ```
 And to run:
 ```
-SE3nv-20240131.sif -m rf2aa.run_inference --config-name protein_na_sm
+python -m rf2aa.run_inference --config-name protein_na_sm
 ```
 <a id="covale"></a>
 ### Predicting Covalently Modified Proteins