Setup

Unfamiliar with the terminal?

If you have trouble with any of the terminal or package-management steps in this guide, we recommend MIT's "Missing Semester of Your CS Education". The first three videos, plus the lecture on Git and version control, should get you up to speed.

ProteinGen was designed to facilitate the use of AI coding agents for writing design pipelines.

We recommend Claude Code (installation instructions below). Other great options include Pi, Amp, and Codex. These tools typically require a paid plan with a model provider, but we've gotten plenty of mileage for our tasks out of the basic tiers.

Installing Claude Code
npm install -g @anthropic-ai/claude-code

This requires Node.js ≥ 18. If you don't have Node.js installed, use one of the following:

# Option 1: nvm
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.3/install.sh | bash
nvm install node

# Option 2: Homebrew
brew install node

Then verify:

claude --version
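If the install or `claude --version` fails, an outdated Node.js is one thing worth ruling out. The sketch below is only an illustration of the "Node.js ≥ 18" requirement stated above: it parses the output of `node --version` and compares the major version. The helper names `node_major` and `installed_node_major` are hypothetical and not part of any tool mentioned in this guide.

```python
import shutil
import subprocess

REQUIRED_NODE_MAJOR = 18  # from the requirement stated in this guide


def node_major(version_text: str) -> int:
    """Parse the major version from `node --version` output such as 'v18.19.0'."""
    return int(version_text.strip().lstrip("v").split(".")[0])


def installed_node_major():
    """Return the installed Node.js major version, or None if it cannot be determined."""
    node = shutil.which("node")
    if node is None:
        return None  # node is not on PATH
    try:
        result = subprocess.run(
            [node, "--version"], capture_output=True, text=True, check=True
        )
        return node_major(result.stdout)
    except (OSError, subprocess.CalledProcessError, ValueError):
        return None


if __name__ == "__main__":
    major = installed_node_major()
    if major is None:
        print("Node.js was not found on PATH (or its version could not be read)")
    elif major < REQUIRED_NODE_MAJOR:
        print(f"Node.js {major} found, but {REQUIRED_NODE_MAJOR} or newer is required")
    else:
        print(f"Node.js {major} satisfies the requirement")
```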

See the Claude Code quickstart for authentication setup and first-run instructions.

Once you install the agent, the easiest way to complete the setup is just to copy the link to this guide

https://ishan-gaur.github.io/proteingen/setup/

and ask it to walk you through the setup. If your agent doesn't have internet access, just clone the repo

git clone https://github.com/ishan-gaur/proteingen.git

and point your model to proteingen/docs/setup.md.

If you'd rather continue manually, the full details for setting up ProteinGen are included below.

2. (Optional) Install a Package Manager

We use uv for all dependency management and running scripts. Conda, Poetry, or good ol' venv are also popular options.

Tip

Any uv pip install ... command works as plain pip install ... from within a conda environment. That being said, uv is super cool, and we highly recommend giving it a try — here's a talk from the founder on why package management is an interesting problem.

Install uv according to your OS:

# macOS and Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows (PowerShell)
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

# Homebrew
brew install uv

See the uv installation docs for additional options (pip, conda, Docker, etc.).

After installing, restart your terminal and verify:

uv --version

Create a project with uv:

mkdir my-protein-project
cd my-protein-project
git init
uv init

If you prefer conda, we recommend Miniforge, which ships with mamba (a faster drop-in replacement for conda). On macOS or Linux:

curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3-$(uname)-$(uname -m).sh

On other platforms, download and run the Miniforge installer.

After installing, restart your terminal and verify:

conda --version

Create a project with conda:

mkdir my-protein-project
cd my-protein-project
git init
conda create -n my-protein-project python=3.12 -y
conda activate my-protein-project

3. Install ProteinGen

ProteinGen is designed to be a library you own, one that you and your agents can adapt as you find what works for you. Accordingly, we recommend an editable local install, but you can also install it as a dependency directly from GitHub.

Requirements

Python 3.12 is required (>=3.12.0, <3.13). Check with python --version. If you're using uv, it will manage the Python version for you. For conda, specify python=3.12 when creating the environment.

GPU (CUDA) is recommended for most models. Smaller models like ProteinMPNN (~1.7M params) run fine on CPU.
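The two requirements above can be checked from Python before installing. This is a minimal sketch, not part of ProteinGen: `python_ok` and `accelerator` are hypothetical helper names, and the GPU check assumes a PyTorch backend, which this guide does not state explicitly. If PyTorch is not installed, the check reports that instead of failing.

```python
import importlib.util
import sys


def python_ok(version=None) -> bool:
    """True if the version is in the required range (>=3.12.0, <3.13)."""
    if version is None:
        version = sys.version_info
    major, minor = version[0], version[1]
    return (major, minor) == (3, 12)


def accelerator() -> str:
    """Return 'cuda', 'cpu', or 'torch-missing'.

    Assumption: models run on PyTorch, so CUDA availability is queried
    through torch. Adjust this if your setup uses a different backend.
    """
    if importlib.util.find_spec("torch") is None:
        return "torch-missing"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    print(f"Python {found}: {'ok' if python_ok() else 'needs 3.12.x'}")
    print(f"Accelerator: {accelerator()}")
```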

# --- Option A: uv ---

# Clone ProteinGen alongside your project
git clone https://github.com/ishan-gaur/proteingen.git

# Add ProteinGen as an editable dependency of your project
cd my-protein-project
uv add --editable ../proteingen

# --- Option B: pip or conda ---

# Ensure your conda environment is active (skip this if you're using plain pip)
conda activate name-of-your-env

# Clone ProteinGen alongside your project
git clone https://github.com/ishan-gaur/proteingen.git

# Install
cd proteingen
pip install -e .

If you don't want to modify ProteinGen's source:

# With uv
uv add "proteingen @ git+https://github.com/ishan-gaur/proteingen.git"

# With pip or conda
pip install "proteingen @ git+https://github.com/ishan-gaur/proteingen.git"

ProteinMPNN model assets (required)

ProteinMPNN support is installed by default with ProteinGen, but you still need to download the ProteinMPNN assets once:

# With uv
uv run foundry install proteinmpnn

# With pip or conda
foundry install proteinmpnn

Optional dependencies

Some dependencies are only available via GitHub (not PyPI) and must be installed separately:

mkdocs-liveedit — live-reloading plugin for the docs site. Required for mkdocs serve:

# With uv
uv pip install "mkdocs-liveedit @ git+https://github.com/ishan-gaur/mkdocs-liveedit.git"

# With pip or conda
pip install "mkdocs-liveedit @ git+https://github.com/ishan-gaur/mkdocs-liveedit.git"

af3-server — Python client for a persistent AlphaFold 3 inference server. Required for structure prediction workflows (e.g. Benchmark — Model Families, Fine-tuning Inverse Folding):

# With uv
uv pip install "af3-server @ git+https://github.com/ishan-gaur/af3-server.git"

# With pip or conda
pip install "af3-server @ git+https://github.com/ishan-gaur/af3-server.git"

See the af3-server README for server setup instructions.

Verify installation

# With uv
uv run python -c "import proteingen; print('ProteinGen installed successfully')"

# With pip or conda
python -c "import proteingen; print('ProteinGen installed successfully')"
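If the import succeeds but you are unsure which copy of the library is in use (for example, the editable clone versus a GitHub install), a slightly longer check can help. The sketch below uses only the standard library; `package_location` is a hypothetical helper name, and the only ProteinGen-specific assumption is the import name `proteingen` used in the command above.

```python
import importlib.util
from pathlib import Path


def package_location(name: str):
    """Return the directory a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # not installed in this environment
    if spec.submodule_search_locations:
        # Regular and namespace packages expose their directories here.
        return Path(next(iter(spec.submodule_search_locations)))
    if spec.origin:
        # Single-file modules: report the containing directory.
        return Path(spec.origin).parent
    return None


if __name__ == "__main__":
    location = package_location("proteingen")
    if location is None:
        print("proteingen is not importable from this environment")
    else:
        print(f"proteingen is imported from: {location}")
```

For an editable install, the printed path should point into your local clone rather than into `site-packages`.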

4. Configure Your Agent to Use ProteinGen

Copy this into your project's AGENTS.md or CLAUDE.md (whatever your agent reads). It instructs your agent to find ProteinGen and use its documentation.

## ProteinGen

Upon your first use of proteingen, complete the following tasks. First, find the path of the installed dependency and add it below. Second, find the .agents/skills/ directory in the proteingen folder and copy that into your main skills folder. The correct location will depend on your agent harness, so look it up or check with your user if unsure.

- Library location: {FILL THIS IN}
- When using ProteinGen, read `../proteingen/AGENTS.md` for working knowledge about the library's internals, gotchas, and design decisions. Make sure to recursively follow the AGENTS.md to the appropriate markdown file discussing the feature of the library you need to use for your task
- Documentation: `../proteingen/docs/` or https://ishan-gaur.github.io/proteingen/

The above text will instruct your model to look at markdown files we've included throughout the codebase that accumulate fixes to errors and gotchas we've found when using agents with ProteinGen over time. It also installs the agent skills ProteinGen ships with. Skills are step-by-step workflows your coding agent can follow for common tasks (e.g. adding a new generative model). See the available skills for a full listing.
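The snippet leaves both first-use tasks to your agent. If you would rather do them by hand, they amount to a string substitution and a directory copy. The following is a rough sketch under stated assumptions: `fill_library_location` and `copy_skills` are hypothetical helpers, and the destination skills folder is something you must supply, since it depends on your agent harness.

```python
import shutil
from pathlib import Path

PLACEHOLDER = "{FILL THIS IN}"  # the marker used in the snippet above


def fill_library_location(snippet: str, location: str) -> str:
    """Replace the library-location placeholder in the AGENTS.md snippet."""
    if PLACEHOLDER not in snippet:
        raise ValueError("snippet does not contain the placeholder")
    return snippet.replace(PLACEHOLDER, location, 1)


def copy_skills(proteingen_root: Path, skills_dest: Path) -> Path:
    """Copy <proteingen_root>/.agents/skills into skills_dest.

    skills_dest is harness-specific; no default is assumed here.
    """
    source = Path(proteingen_root) / ".agents" / "skills"
    if not source.is_dir():
        raise FileNotFoundError(f"no skills directory at {source}")
    # dirs_exist_ok lets this merge into an existing skills folder.
    shutil.copytree(source, skills_dest, dirs_exist_ok=True)
    return Path(skills_dest)


if __name__ == "__main__":
    line = "- Library location: {FILL THIS IN}"
    print(fill_library_location(line, "../proteingen"))
```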

5. Next Steps

You're all set! Here's where to go from here:

  • Design Philosophy — understand the three base classes (ProbabilityModel, GenerativeModel, PredictiveModel) and how they compose
  • Examples — working end-to-end code for sampling, fine-tuning, and guided generation
  • Workflows — step-by-step recipes for common protein design tasks
  • Models — all supported models, their capabilities, and code examples