Is Letta the same as MemGPT?

Letta is the successor project to MemGPT. Current setup guides should use Letta commands and Letta documentation rather than older MemGPT instructions.

What port does the local Letta server use?

The standard Docker example maps the server to port `8283`, with API requests under `http://localhost:8283/v1`.

Can Letta use Ollama?

Yes. For Docker Desktop, set `OLLAMA_BASE_URL` to `http://host.docker.internal:11434/v1`. For Linux host networking, use `http://localhost:11434/v1`.

Does Letta need an embedding model?

For Docker-hosted agents, specify an embedding model when creating agents. Embeddings power archival memory search and retrieval.

Can I run Letta fully offline?

You can run the server and local model providers offline after images, packages, and models are already available. Any hosted model provider or online tool still requires network access.

How to Install Letta Locally for AI Agent Memory

Tool overview

Letta is the agent framework that grew out of MemGPT. Its main idea is persistent agent memory: instead of treating every conversation as a disposable chat, Letta agents keep structured state, memory blocks, archival memory, tools, and message history across sessions.

That makes Letta useful for assistants, research agents, coding agents, support agents, and long-running workflows where memory is part of the product rather than an afterthought.

Local setup choices

For most local users, there are three paths:

Docker server: best for a local API server with persistent database storage.
Letta Desktop: easiest visual path when your platform is supported.
Python or SDK work: best when you are building against the Letta API from your own app.

This guide focuses on the Docker server because it is the cleanest local setup for developers who want persistence, API access, and Ollama integration.

Step-by-step setup

Quick requirements

Prepare:

Docker
Ollama if you want local models
A capable local model already pulled in Ollama
Python if you plan to use the Python SDK
Enough RAM for both the Letta server and your model runner

Check Ollama:

ollama list
curl http://localhost:11434/v1/models
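The same check can be scripted before starting Letta. The sketch below assumes the endpoint answers in the usual OpenAI-style listing shape, a JSON object with a `data` array whose entries carry an `id`. The function names are illustrative and not part of Ollama or Letta.

```python
import json
from urllib.request import urlopen


def model_ids(payload: dict) -> list[str]:
    """Extract model ids from an OpenAI-style listing (assumed shape: {"data": [{"id": ...}]})."""
    return [entry["id"] for entry in payload.get("data", []) if "id" in entry]


def fetch_model_ids(base_url: str = "http://localhost:11434/v1", timeout: float = 5.0) -> list[str]:
    """Query <base_url>/models and return the ids. Needs a running Ollama instance."""
    with urlopen(f"{base_url}/models", timeout=timeout) as response:
        return model_ids(json.load(response))
```

Calling `fetch_model_ids()` on a machine where Ollama is running should return the models you have pulled. An empty list suggests that no model has been pulled yet.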

Letta can use hosted model providers too, but a local setup is simpler to reason about when the Letta server and model endpoint are both under your control.

Step 1: Start the Letta server

Run:

docker run \
  -v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
  -p 8283:8283 \
  letta/letta:latest

The volume mapping is important. It keeps the server database outside the container so your agents and memory survive container restarts.
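If you launch the server from a script, the command above can be assembled as an argument list. This is a minimal sketch: `letta_docker_command` is a hypothetical helper, and the flags it emits are only the ones already shown in this guide.

```python
from pathlib import Path


def letta_docker_command(data_dir, host_port=8283, env=None, image="letta/letta:latest"):
    """Build the `docker run` argument list for a Letta server with a mapped data directory."""
    # No shell is involved when the list is passed to subprocess, so expand "~" here.
    host_path = Path(data_dir).expanduser()
    args = [
        "docker", "run",
        "-v", f"{host_path}:/var/lib/postgresql/data",  # keep Postgres data outside the container
        "-p", f"{host_port}:8283",                      # host port -> server port inside the container
    ]
    for key, value in (env or {}).items():
        args += ["-e", f"{key}={value}"]
    args.append(image)
    return args
```

To run it, pass the list to `subprocess.run(letta_docker_command("~/.letta/.persist/pgdata"), check=True)`. Building the list in one place makes it harder to start a container without the volume mapping by accident.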

Open the API base:

http://localhost:8283/v1

You can also connect the Agent Development Environment to the local server when you want a visual way to inspect agents and memory.

Step 2: Add Ollama as a model provider

If Letta runs in Docker and Ollama runs on the same Windows or macOS host, use:

docker run \
  -v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
  -p 8283:8283 \
  -e OLLAMA_BASE_URL="http://host.docker.internal:11434/v1" \
  letta/letta:latest

For Linux:

docker run \
  -v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
  --network host \
  -e OLLAMA_BASE_URL="http://localhost:11434/v1" \
  letta/letta:latest

Letta expects the Ollama OpenAI-compatible endpoint for this integration, so include /v1.
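The two cases above reduce to one decision: whether the container shares the host network. A small sketch, using hypothetical helper names, that picks the base URL and guards against a missing /v1 suffix:

```python
OLLAMA_PORT = 11434  # port used in the examples in this guide


def ollama_base_url(host_networking: bool) -> str:
    """Return the OLLAMA_BASE_URL value for the two setups described above."""
    # With --network host the container sees the host's localhost directly;
    # otherwise Docker Desktop exposes the host as host.docker.internal.
    host = "localhost" if host_networking else "host.docker.internal"
    return f"http://{host}:{OLLAMA_PORT}/v1"


def with_v1(url: str) -> str:
    """Append the /v1 suffix if a hand-written URL is missing it."""
    trimmed = url.rstrip("/")
    return trimmed if trimmed.endswith("/v1") else trimmed + "/v1"
```

For example, `with_v1("http://localhost:11434")` returns the URL with the suffix added, which is the form this integration expects.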

Step 3: Configure embeddings

Letta uses embeddings for archival memory search and retrieval. When creating agents against a Docker server, specify an embedding model explicitly.

Install the Python client in your project environment:

pip install letta-client

Then create a client:

from letta_client import Letta

client = Letta(base_url="http://localhost:8283")

A basic agent creation call needs both a model and an embedding model:

agent = client.agents.create(
    model="openai/gpt-4o-mini",
    embedding="openai/text-embedding-3-small",
)

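The handles above point at OpenAI, so they work only if the server was started with OpenAI credentials. For a local-only setup, choose model and embedding handles that your Letta server can actually reach through the providers you configured. Do not assume a chat model can also serve as a good embedding model.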
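A quick sanity check before calling the API can catch the most common configuration slips. The sketch below assumes handles follow the `provider/name` shape seen in the example above. `check_handles` is a hypothetical helper, not part of the Letta client.

```python
def check_handles(model: str, embedding: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means nothing obvious is wrong."""
    problems = []
    for role, handle in (("model", model), ("embedding", embedding)):
        provider, _, name = (handle or "").partition("/")
        if not provider or not name:
            problems.append(f"{role} handle {handle!r} is not in provider/name form")
    # Reusing the chat handle for embeddings is usually a copy-paste mistake.
    if model and model == embedding:
        problems.append("model and embedding use the same handle")
    return problems
```

This validates only the shape of the strings. Whether the server can reach the provider behind each handle still has to be confirmed against the running server.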

Step 4: Create a memory agent

A Letta agent usually has memory blocks, tools, and a model configuration. A minimal useful pattern is:

from letta_client import Letta

client = Letta(base_url="http://localhost:8283")

agent = client.agents.create(
    model="openai/gpt-4o-mini",
    embedding="openai/text-embedding-3-small",
    memory_blocks=[
        {
            "label": "human",
            "value": "The user is testing a local Letta setup.",
        },
        {
            "label": "persona",
            "value": "I am a careful local assistant with persistent memory.",
        },
    ],
)

response = client.agents.messages.create(
    agent.id,
    messages=[
        {
            "role": "user",
            "content": "Remember that this server is running locally.",
        }
    ],
)

print(response)

If this fails, first check that the server is reachable, then check model provider access, then check embedding configuration.
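That troubleshooting order can be written down as a short-circuiting list of probes. This is a sketch of the idea only: the probes are callables you supply, and nothing here calls Letta itself.

```python
from typing import Callable, Optional

Check = tuple[str, Callable[[], bool]]


def first_failure(checks: list[Check]) -> Optional[str]:
    """Run probes in order and return the name of the first one that fails, or None."""
    for name, probe in checks:
        try:
            ok = probe()
        except Exception:
            # A probe that raises (for example a refused connection) counts as a failure.
            ok = False
        if not ok:
            return name
    return None
```

In practice the list would hold three entries in the order given above: server reachability, model provider access, and embedding configuration. Stopping at the first failure avoids chasing an embedding error that is really a dead server.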

Step 5: Add password protection

If the server is only on your own laptop, you may start without password protection during a short test. For anything shared, networked, or long-running, add it from the start:

docker run \
  -v ~/.letta/.persist/pgdata:/var/lib/postgresql/data \
  -p 8283:8283 \
  -e SECURE=true \
  -e LETTA_SERVER_PASSWORD="change-this-password" \
  letta/letta:latest

Then clients must authenticate:

from letta_client import Letta

client = Letta(
    base_url="http://localhost:8283",
    api_key="change-this-password",
)

Use a real secret for any persistent installation. Do not expose a Letta server without access control.
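One way to avoid a hardcoded password is to generate it once and read it from the environment in client code. A minimal sketch, assuming the client reads the same `LETTA_SERVER_PASSWORD` variable that the server was started with; both helper names are hypothetical.

```python
import os
import secrets


def new_server_password(nbytes: int = 32) -> str:
    """Generate a random URL-safe secret suitable for LETTA_SERVER_PASSWORD."""
    return secrets.token_urlsafe(nbytes)


def client_kwargs(env=None, base_url: str = "http://localhost:8283") -> dict:
    """Build keyword arguments for the Letta client from environment variables."""
    env = os.environ if env is None else env
    password = env.get("LETTA_SERVER_PASSWORD")
    if not password:
        raise RuntimeError("LETTA_SERVER_PASSWORD is not set")
    return {"base_url": base_url, "api_key": password}
```

The client from the example above can then be created with `Letta(**client_kwargs())`, which keeps the secret out of source control.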

Step 6: Decide whether local models are enough

Letta can run with local models, but memory-heavy agents are demanding. If the model is too small, too slow, or poor at following structured tool calls, the agent may loop, forget instructions, or produce unreliable tool arguments.

For local testing:

Start with one agent.
Keep the context and task small.
Use a stronger local model when possible.
Avoid complex tool chains until basic memory works.
Test embedding retrieval separately from chat quality.

If you are evaluating Letta Code or advanced computer-use workflows, start with a high-quality model first so you understand the intended behavior before switching to open-weight local models.

Persistence and backups

The most important file path in the basic Docker setup is:

~/.letta/.persist/pgdata

That is where the mapped Postgres data lives in the common local command. Back it up before changing versions, moving machines, or deleting containers.
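A file-level backup can be as simple as archiving that directory. The sketch below is one possible approach, not an official Letta procedure. Stop the container first, because copying a live Postgres data directory can produce an inconsistent archive. Reading the files may also need elevated permissions, depending on which user owns them on your host.

```python
import tarfile
import time
from pathlib import Path


def backup_pgdata(data_dir, backup_dir) -> Path:
    """Archive the mapped data directory into a timestamped .tar.gz and return its path."""
    source = Path(data_dir).expanduser()
    if not source.is_dir():
        raise FileNotFoundError(f"data directory not found: {source}")
    target_dir = Path(backup_dir).expanduser()
    target_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = target_dir / f"pgdata-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        # Store paths relative to the directory name so the archive restores cleanly.
        tar.add(str(source), arcname=source.name)
    return archive
```

A call such as `backup_pgdata("~/.letta/.persist/pgdata", "~/letta-backups")` writes the archive to a destination folder of your choice; the folder name here is only an example.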

Be careful with:

docker rm -f <container>
rm -rf ~/.letta/.persist/pgdata

The first command removes a container, which is safe for your data only when the database directory is mapped outside it. The second removes the actual persisted data path if you used the default example. Only do that when you intentionally want to wipe the local server.

Common problems

The server does not respond on port 8283

Check the container:

docker ps
docker logs <container-name-or-id>

If another service uses port 8283, map Letta to a different host port:

docker run -v ~/.letta/.persist/pgdata:/var/lib/postgresql/data -p 8290:8283 letta/letta:latest

Then connect to http://localhost:8290.
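To find out whether a host port is already taken before remapping, a bind test is usually enough. This is a rough sketch with hypothetical helper names. It only tests the loopback interface, so treat the result as a hint rather than a guarantee.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(start: int = 8283, attempts: int = 20) -> int:
    """Return the first free port at or above `start`, trying `attempts` candidates."""
    for port in range(start, start + attempts):
        if port_is_free(port):
            return port
    raise RuntimeError(f"no free port found in {start}-{start + attempts - 1}")
```

The value returned by `first_free_port()` can be used as the host side of the `-p` mapping, with the container side left at 8283.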

Letta cannot reach Ollama

Use http://host.docker.internal:11434/v1 on Docker Desktop. Use host networking and http://localhost:11434/v1 on Linux. Confirm Ollama responds before starting Letta:

curl http://localhost:11434/v1/models

Agent creation fails

Check that the model and embedding providers are both configured. Letta's chat model and embedding model are separate concerns.

Memory seems unreliable

Use a stronger model, reduce task complexity, and inspect memory blocks. Local models vary widely in how well they follow structured agent instructions.

Background, planning, and caveats

Approximate planning cost (U.S.)

A local Letta setup is practical on a developer machine, but because agents are persistent, ongoing operational costs show up quickly:

Hardware: local models plus PostgreSQL and embeddings increase memory and CPU demand.
Storage: ~/.letta/.persist/pgdata stores Postgres state and grows with agent conversations.
Optional VPS/cloud: useful for 24/7 access and team tests, usually with monthly recurring expense.
API/provider costs: hosted providers for tool calls or advanced models can dominate runtime cost.
Optional managed hosting: Letta cloud options remove container/db management but change cost structure to service fees.
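Storage growth is the easiest of these costs to measure directly. The sketch below uses hypothetical helpers to walk the mapped data directory and report its size; reading it may need elevated permissions, depending on file ownership.

```python
import os
from pathlib import Path


def dir_size_bytes(path) -> int:
    """Sum the sizes of all regular files under `path`, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(Path(path).expanduser()):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def human_size(num_bytes: int) -> str:
    """Format a byte count with binary units, capped at GiB."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if size < 1024 or unit == "GiB":
            return f"{size:.1f} {unit}"
        size /= 1024
```

Logging `human_size(dir_size_bytes("~/.letta/.persist/pgdata"))` before and after a test run gives a rough sense of how quickly agent conversations grow the database.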

Cost breakdown

One-time local setup:
- machine and storage for the mapped data path
- backups
- optional secrets manager or local security tooling

Recurring local setup:
- model/API provider usage
- infrastructure monitoring and upgrades

Data persistence and backups

The key local state is in:

~/.letta/.persist/pgdata

Use these habits:

Back up before updates.
Do not delete the mapped data directory while a known-good server still depends on it.
Export important agents and test configurations before a major migration.

Credential handling

Keep API keys in .env or secure environment variables, not hardcoded in images or scripts.
Enable SECURE=true and a real LETTA_SERVER_PASSWORD for non-local use.
Rotate model/provider credentials when environment scope changes.

Security risk and operational cautions

In production-like use, a network-exposed Letta server needs authentication and access control.
Tool usage can create side effects; set provider scope and limits to reduce blast radius.
The Letta docs explicitly recommend sandboxing and secure startup posture for non-trusted user scenarios.

Questions before installing

Does the team need persistent memory across sessions or short-lived demos?
Are you prepared to maintain the bundled Postgres database and test restores?
Do you have a model/provider strategy for both chat and embeddings from day one?

Red flags

No rollback snapshot before version upgrades.
Exposing the API before password protection and key policy are set.
Assuming local tests with small contexts scale linearly to large multi-agent runs.

Rollback and update guidance

For rollback:

Restore from a backup that includes pgdata and your env configuration.
Test the restored API and one representative agent before enabling all integrations.

For update:

Update the image and restart within a controlled downtime window.
Verify providers, embeddings, and memory behavior before returning traffic.

Bottom line

For a local Letta setup, run the Docker server on port 8283, keep the Postgres volume mapped, connect Ollama through the OpenAI-compatible /v1 endpoint, and specify embeddings when creating agents. Treat the server as a stateful database-backed agent system, not a disposable chat container.
