Step-by-step setup
Quick requirements
Prepare:
- Windows 10 22H2 or newer, macOS Sonoma 14 or newer, or a supported Linux system
- Enough disk space for Ollama and downloaded models
- Internet access for the installer and first model download
- A terminal: PowerShell, Terminal, or a Linux shell
- Patience for the first model download, which can be several gigabytes
For a first test, choose a small model before trying larger 7B, 14B, or bigger models.
Step 1: Download Ollama from the official source
Use the official download page:
https://ollama.com/download
Do not download repackaged installers from third-party sites. Ollama updates frequently, and the official installer is the safest way to get the current desktop app, CLI, and local service for your operating system.
The official documentation currently covers Windows, macOS, Linux, Docker, API endpoints, model library pages, and troubleshooting. If an older blog post gives different commands, prefer the official docs.
Step 2: Install Ollama on Windows
On Windows:
- Open the official Ollama download page.
- Choose the Windows download.
- Run OllamaSetup.exe.
- Open PowerShell, Command Prompt, or Windows Terminal.
- Check that the command is available:
ollama --version
Ollama for Windows runs as a native application and makes the ollama command available in cmd, PowerShell, and other terminal apps after installation. The Windows docs state that the local API is served on http://localhost:11434.
If ollama is not recognized, close and reopen the terminal. If it still fails, restart Windows or check whether the installer was blocked by security software.
Step 3: Install Ollama on macOS
On macOS:
- Open the official Ollama download page.
- Download the macOS build.
- Mount the downloaded file.
- Drag Ollama into the Applications folder.
- Launch Ollama once so it can set up the command-line link.
- Open Terminal and check:
ollama --version
The current macOS documentation lists macOS Sonoma 14 or newer as the system requirement. Apple Silicon Macs get CPU and GPU support; x86 (Intel) Macs are CPU-only, according to the current official system notes.
If the terminal cannot find ollama, open the Ollama app again and allow it to create the command-line link when prompted.
Step 4: Install Ollama on Linux
The current official Linux install command is:
curl -fsSL https://ollama.com/install.sh | sh
After it finishes, verify the command:
ollama --version
On many Linux systems, the installer configures Ollama as a service. If you need to start it manually, use:
ollama serve
For a service-based Linux install, you can also check:
sudo systemctl status ollama
Linux GPU behavior depends on your drivers and hardware. For Nvidia, check that your driver is current and that nvidia-smi works. For AMD, read the current ROCm and Ollama Linux notes before assuming acceleration will work.
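The Linux checks above can be gathered into one read-only script. This is a minimal sketch, assuming a POSIX shell; `have_cmd` and `linux_preflight` are hypothetical helper names, not part of Ollama. It only reports what it finds and changes nothing.

```shell
# Hypothetical helper: succeed if a command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Print one line per check. Nothing is installed, started, or modified.
linux_preflight() {
    if have_cmd ollama; then
        echo "ollama: found"
    else
        echo "ollama: not found on PATH"
    fi
    if have_cmd systemctl; then
        # is-active prints the unit state, for example "active" or "inactive"
        echo "service: $(systemctl is-active ollama 2>/dev/null)"
    else
        echo "service: systemctl not available; start manually with 'ollama serve'"
    fi
    if have_cmd nvidia-smi; then
        echo "nvidia-smi: found"
    else
        echo "nvidia-smi: not found (normal on machines without an Nvidia GPU)"
    fi
}
```

Call `linux_preflight` after the installer finishes. A "not found" line tells you which layer to fix first.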
Step 5: Run your first local model
Start with a small model. A practical first command is:
ollama run llama3.2:3b
This downloads the model if it is not already present, loads it, and opens an interactive chat in your terminal.
Ask a short test question:
Write one sentence explaining what Ollama does.
If that works, you have confirmed the main path: Ollama is installed, the model library is reachable, a model can be downloaded, and local inference works.
Other beginner-friendly first tests include:
ollama run gemma3:4b
ollama run qwen3:4b
Choose one. Do not download several large models before you know your computer can run them comfortably.
Step 6: List installed models
After running a model, list local models:
ollama list
You should see the model name, ID, size, and modified time. This confirms that the model is stored locally.
If the list is empty, the model download did not complete or you ran the command in an environment that is not using the same Ollama installation.
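If you want a script to tell "no models yet" apart from "one or more models", you can count the rows. This is a small sketch under one assumption: the first line of `ollama list` output is a column header and every later non-empty line is one model. `count_models` is a hypothetical helper name.

```shell
# Count model rows in `ollama list` output read from stdin.
# Assumption: line 1 is the header; each later non-empty line is one model.
count_models() {
    awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }'
}

# Usage: ollama list | count_models
```

A result of 0 means the download did not complete or the command is talking to a different Ollama installation.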
Step 7: Check the local API
While Ollama is running, it exposes a local API. The base URL is:
http://localhost:11434/api
On macOS or Linux, check the model list endpoint:
curl http://localhost:11434/api/tags
On Windows PowerShell:
Invoke-RestMethod http://localhost:11434/api/tags
You should see a JSON response listing installed models. If you want to test generation through the API, use a short prompt.
macOS or Linux:
curl http://localhost:11434/api/generate -d '{"model":"llama3.2:3b","prompt":"Say hello in one short sentence.","stream":false}'
Windows PowerShell:
Invoke-RestMethod -Uri "http://localhost:11434/api/generate" -Method Post -ContentType "application/json" -Body '{"model":"llama3.2:3b","prompt":"Say hello in one short sentence.","stream":false}'
If this works, Ollama is ready for local apps and scripts.
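Hand-written JSON breaks as soon as a prompt contains a double quote. The sketch below builds the same request body as the curl example above and escapes quotes and backslashes first. `json_escape` and `generate_body` are hypothetical helper names, and the escaping is deliberately minimal: it does not handle newlines or other control characters.

```shell
# Escape backslashes and double quotes so a string can sit inside JSON quotes.
# Limitation: newlines and other control characters are not handled.
json_escape() {
    printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Build a non-streaming request body for /api/generate.
# $1 = model name, $2 = prompt
generate_body() {
    printf '{"model":"%s","prompt":"%s","stream":false}' \
        "$(json_escape "$1")" "$(json_escape "$2")"
}

# Usage (needs a running Ollama and a model you have already downloaded):
#   curl -s http://localhost:11434/api/generate \
#        -d "$(generate_body llama3.2:3b 'Say "hello" in one short sentence.')"
```

For anything beyond quick tests, a real JSON tool is a safer choice than string escaping.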
Step 8: Connect Ollama to a UI or app
Ollama by itself is enough for terminal chat and local API use. Add another tool only when you know what problem it solves.
Use Open WebUI if you want a browser interface, chat history, document uploads, user accounts, and admin settings. A common local pattern is:
- Ollama runs models on the host machine.
- Open WebUI runs as a Docker container or local app.
- Open WebUI points to the Ollama base URL.
If Open WebUI runs in Docker and Ollama runs on the host, localhost inside the container usually means the container itself, not the host machine. On Docker Desktop, the host URL is often:
http://host.docker.internal:11434
If Open WebUI runs directly on the same machine without Docker, use:
http://localhost:11434
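The two cases above amount to a small lookup. This sketch only encodes the URLs already mentioned; `ollama_url_for` and its two labels are hypothetical names, and other Docker setups may need a different address.

```shell
# Print the Ollama base URL a client should use, based on where the client runs.
#   docker-desktop -> client in a Docker Desktop container, Ollama on the host
#   host           -> client runs directly on the same machine as Ollama
ollama_url_for() {
    case "$1" in
        docker-desktop) echo "http://host.docker.internal:11434" ;;
        host)           echo "http://localhost:11434" ;;
        *)              echo "unknown client location: $1" >&2; return 1 ;;
    esac
}
```

Paste the printed URL into the Open WebUI connection setting, then confirm that the model list appears.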
Use LM Studio instead of Ollama when you want a GUI-first desktop app for searching, downloading, loading, and chatting with local models. Use Ollama when you want a simple local model runner and API foundation that other tools can connect to.
Verify it works
Run these checks before connecting more tools:
ollama --version
ollama list
curl http://localhost:11434/api/tags
On Windows PowerShell, replace the curl check with:
Invoke-RestMethod http://localhost:11434/api/tags
A working setup should show:
- The ollama command responds.
- At least one model appears in ollama list.
- A short prompt returns an answer.
- The local API returns JSON from /api/tags.
- Your computer stays responsive while the model answers.
If the model works in the terminal but a separate app cannot see it, the problem is usually the connection URL, container networking, or a missing model name, not the model itself.
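To test one layer at a time, you can wrap the checks in a script that stops at the first failure. This is a sketch for macOS or Linux, assuming a POSIX shell and curl; `run_check` and `verify_ollama` are hypothetical names. It relies on each command exiting with a non-zero status when its layer is broken, which is an assumption worth confirming on your system.

```shell
# Run one named check and report PASS or FAIL. The check's own output is hidden.
run_check() {
    name=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS $name"
    else
        echo "FAIL $name"
        return 1
    fi
}

# Check the CLI, then the local server, then the HTTP API. Stop at the first FAIL.
verify_ollama() {
    run_check "cli responds"     ollama --version &&
    run_check "server reachable" ollama list &&
    run_check "api returns data" curl -fsS http://localhost:11434/api/tags
}
```

The first FAIL line names the layer to fix before you look at any connected app. The script does not confirm that a model is installed or that a prompt returns an answer, so check those by hand.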
Common problems
The ollama command is not found
Close and reopen the terminal first. On Windows and macOS, the command may not be available in a terminal that was already open during installation.
On macOS, launch the Ollama app once and allow it to create the command-line link. On Windows, confirm that the installer completed and that you are using a normal PowerShell or Command Prompt session.
The model download is slow
Model files are large. Even a 2 GB to 5 GB model takes time on a slow connection, and bigger models take proportionally longer. Keep the terminal open until the download finishes.
If a download fails, rerun the same ollama run command. Avoid switching to a larger model while troubleshooting the first download.
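On a flaky connection, a small retry loop saves retyping. The sketch below is generic shell, not an Ollama feature: `retry` and the `RETRY_DELAY` variable are hypothetical names. The usage line swaps in `ollama pull`, which downloads a model without opening a chat, in place of `ollama run`.

```shell
# Retry a command up to N times, pausing between attempts.
# RETRY_DELAY (seconds) is a hypothetical knob for this sketch; default is 5.
retry() {
    attempts=$1
    shift
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $i of $attempts failed" >&2
        if [ "$i" -lt "$attempts" ]; then
            sleep "${RETRY_DELAY:-5}"
        fi
        i=$((i + 1))
    done
    return 1
}

# Usage: retry 3 ollama pull llama3.2:3b
```

If three attempts fail, check the network before trying a different model.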
The model loads but answers slowly
Use a smaller model. Speed depends on RAM, VRAM, CPU, GPU acceleration, cooling, context length, and what else your computer is doing.
Try:
ollama run llama3.2:1b
or:
ollama run gemma3:1b
Small models are not as capable, but they are useful for confirming that the setup works.
Port 11434 is unavailable
Another process may be using the port, or Ollama may not be running. Check whether Ollama is active:
ollama list
On Linux:
sudo systemctl status ollama
If you changed OLLAMA_HOST, container settings, or service configuration, undo that change and test the default local setup before debugging remote access.
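To see what is actually listening on the port, you can ask the operating system. This is a sketch for macOS or Linux that assumes `lsof` is installed; `port_of` and `who_has_port` are hypothetical helper names, and `port_of` assumes `OLLAMA_HOST` is either empty or in plain host:port form.

```shell
# Print the port from a host:port value, falling back to Ollama's default port.
# Assumption: the value is empty or has the plain form host:port.
port_of() {
    case "$1" in
        *:*) echo "${1##*:}" ;;
        *)   echo "11434" ;;
    esac
}

# Show which process is listening on the Ollama port.
# Needs lsof; other users' processes may only be visible with sudo.
who_has_port() {
    lsof -nP -iTCP:"$(port_of "${OLLAMA_HOST:-}")" -sTCP:LISTEN
}
```

If `who_has_port` prints a process other than Ollama, stop that process or restore the default configuration before testing again.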
Open WebUI cannot see Ollama models
First check Ollama directly:
ollama list
curl http://localhost:11434/api/tags
If that works, check the Open WebUI connection setting. For Docker Desktop on Windows and macOS, the Ollama URL is often:
http://host.docker.internal:11434
For a direct local install on the same host, use:
http://localhost:11434
Also confirm that you have pulled at least one model. Open WebUI cannot show a local model that Ollama has not downloaded.
GPU acceleration does not seem to work
Start by confirming that CPU inference works with a small model. Then update GPU drivers and check the official Ollama hardware notes for your platform.
Do not assume every GPU will accelerate every model on every operating system. Windows, macOS, Linux, Nvidia, AMD, Apple Silicon, and CPU-only machines have different paths.
Next useful actions
After Ollama works locally, choose the next step based on your actual goal:
- For model choice, compare beginner models before downloading large files.
- For a browser UI, install Open WebUI and connect it to Ollama.
- For a GUI-first desktop app, compare LM Studio.
- For automation, test n8n or Dify only after a local model answers reliably.
- For coding tools, start with low-risk prompts before giving an assistant access to files or commands.
The best next step is usually not installing five more tools. It is making one local model useful for one real task.
Background, planning, and caveats
What Ollama does
Ollama is a local model runner. It downloads supported model packages, stores them on your machine, runs them locally, and exposes a local API that other tools can call.
That makes it a useful foundation for local AI experiments because one model service can support terminal chat, scripts, browser UIs, and automation tools.
What Ollama is not
Ollama is not a guarantee that every AI workflow is private. The model inference can be local, but a connected app may still use cloud APIs, web search, telemetry, remote document storage, external embeddings, or hosted model providers.
If privacy is the reason you are installing Ollama, test the whole workflow, not just the model runner.
Model licenses still matter
Ollama makes models easy to run, but each model has its own license, model card, and use restrictions. Before using a model for client work, commercial workflows, public content, or regulated data, check the model's license and acceptable-use terms.
Local hardware has limits
If a model barely fits in memory, it may load slowly, respond slowly, or make the computer unpleasant to use. Context length, document size, and multi-step tools can increase memory pressure beyond the simple model download size.
Start small, then step up only when you have a reason.
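A rough rule of thumb helps when deciding whether to step up: the weights alone need about (parameters in billions) x (bits per weight) / 8 gigabytes. The sketch below does only that arithmetic. `weights_gb` is a hypothetical helper, the bits-per-weight value is your own assumption about how the model is quantized, and real memory use is higher because context and runtime overhead are not included.

```shell
# Rough size of model weights in GB: parameters (billions) x bits per weight / 8.
# Ignores context memory and runtime overhead, so actual usage is higher.
weights_gb() {
    awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

# Usage: weights_gb 7 4    (7B parameters at an assumed 4 bits per weight)
```

Treat the result as a floor, not a prediction. If the number is close to your available RAM or VRAM, expect the model to be slow or to fail to load.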
Security and scope checks
Before connecting Ollama to other tools, check:
- Is the API still bound to local access only?
- Are you exposing 11434 to your network or the internet?
- Does the connected app use only local models, or does it also call cloud providers?
- Are documents, prompts, or chat history stored somewhere outside your machine?
- Does the tool have permission to run commands, read files, or change local data?
- Does the model license allow your intended use?
For normal personal use, keep Ollama local. Do not expose the API publicly without authentication, firewall rules, and a clear operational reason.
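The first two questions in the checklist can be partly answered by looking at the bind address. This sketch classifies an `OLLAMA_HOST`-style value and is conservative: anything it does not recognize as loopback counts as exposed. `bind_scope` is a hypothetical helper. It inspects only the value you pass in, so it will not see settings stored in a service file or a container.

```shell
# Classify an OLLAMA_HOST-style value as "local" or "exposed".
# Assumption: an empty value means Ollama's default, which binds to loopback.
bind_scope() {
    host=${1#*://}    # drop an optional scheme such as http://
    case "$host" in
        ""|127.*|localhost|localhost:*|"[::1]"|"[::1]:"*) echo "local" ;;
        *) echo "exposed" ;;
    esac
}

# Usage: bind_scope "${OLLAMA_HOST:-}"
```

An "exposed" result is a prompt to check firewall rules and authentication. It is not proof that the port is reachable from outside.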
Red flags and common errors
Be careful if you see any of these:
- A tutorial tells you to expose Ollama to the public internet for convenience.
- A Docker app points to localhost:11434 but runs in a separate container.
- You download a huge model before testing a small one.
- You paste secrets into a tool before checking whether the full workflow is local.
- You assume "free local model" means "safe for all commercial use."
- You troubleshoot Open WebUI before confirming ollama list and /api/tags.
- You treat GPU acceleration as guaranteed without checking drivers and platform support.
Most setup problems become easier when you test one layer at a time: install Ollama, run one model, verify the API, then add a UI.
Questions to ask before building on Ollama
- Do you need a terminal tool, a desktop app, a browser UI, or an automation backend?
- What model size can your machine run comfortably?
- Do you need offline use after models are downloaded?
- Will the workflow handle sensitive documents or credentials?
- Do you need the best possible model quality, or is local control more important?
- Will more than one person use the system?
- How will you back up model settings, chat history, documents, or workflow files?
These questions matter more after the first test works. At the beginning, keep the goal simple: install Ollama, run one model, verify the API.
Bottom line
Install Ollama from the official download page, run a small first model such as llama3.2:3b, confirm it appears in ollama list, and verify the local API at http://localhost:11434/api/tags. Once that foundation works, add Open WebUI, LM Studio, n8n, Dify, or coding tools only for a specific workflow.