Local LLM Setup Guide

Using Google Gemma 3 4B (gemma3:4b) as an example, this guide covers installing Ollama and LM Studio on macOS, Windows, and Linux, pulling the model, and enabling LAN access.

1. Ollama

Ollama provides a CLI and local API. Default port is 11434.

1.1 Install

  • macOS: Download from ollama.com/download/mac; the app auto-updates.
  • Windows: Download OllamaSetup.exe from ollama.com/download, or run in PowerShell:
    irm https://ollama.com/install.ps1 | iex
    Requires Windows 10 or later.
  • Linux: Run in a terminal:
    curl -fsSL https://ollama.com/install.sh | sh

1.2 Pull a model (e.g. Gemma 3 4B)

In Ollama this model is named gemma3:4b (Hugging Face: google/gemma-3-4b).

ollama pull gemma3:4b

Then run a chat:

ollama run gemma3:4b
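With the model pulled, the local API on port 11434 can also be called directly. A minimal sketch using only the Python standard library against Ollama's /api/generate endpoint (the prompt and host here are placeholders, not part of the guide above):

```python
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434"  # default local Ollama endpoint


def build_generate_request(model: str, prompt: str) -> bytes:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    # stream=False asks Ollama to return one complete JSON object
    # instead of a stream of newline-delimited chunks.
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()


def generate(model: str, prompt: str) -> str:
    """Send one prompt and return the model's full response text."""
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=build_generate_request(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# Usage (requires a running Ollama with the model pulled):
# print(generate("gemma3:4b", "Say hello in one sentence."))
```

By default /api/generate streams newline-delimited JSON; setting "stream" to false returns a single JSON object, which keeps the client code short.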

1.3 Allow LAN access

By default Ollama listens on 127.0.0.1. To allow other devices on your network, set OLLAMA_HOST=0.0.0.0.

  • One-off (current terminal):
    OLLAMA_HOST=0.0.0.0 ollama serve
    If the Ollama app is already running, quit it first, then run the command above in a terminal.
  • macOS (persistent): Edit Ollama’s launchd plist (e.g. under ~/Library/LaunchAgents/ or Homebrew’s plist) and add inside the <dict>:
    <key>EnvironmentVariables</key>
    <dict>
      <key>OLLAMA_HOST</key>
      <string>0.0.0.0</string>
    </dict>
    Restart Ollama after saving. Alternatively, skip editing the plist and run OLLAMA_HOST=0.0.0.0 ollama serve in a terminal when you need LAN access.
  • Windows: Add a user or system environment variable OLLAMA_HOST = 0.0.0.0, then restart the Ollama app/service.
  • Linux (systemd): Run sudo systemctl edit ollama (or edit /etc/systemd/system/ollama.service directly) and add under [Service]:
    Environment="OLLAMA_HOST=0.0.0.0"
    Then run:
    sudo systemctl daemon-reload
    sudo systemctl restart ollama

For browser or cross-origin clients you may also set OLLAMA_ORIGINS=* (recommended only on a trusted LAN).

Other devices on the LAN can then use http://<your-machine-IP>:11434 (e.g. http://192.168.1.100:11434).
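A quick way to confirm LAN access is working is to query /api/tags, which lists the models installed on the server. A small standard-library sketch, to be run from another device (the IP in the comment is an example):

```python
import json
import urllib.request


def list_models(host: str, port: int = 11434) -> list[str]:
    """Return the model names an Ollama server reports via /api/tags."""
    url = f"http://{host}:{port}/api/tags"
    with urllib.request.urlopen(url, timeout=5) as resp:
        data = json.loads(resp.read())
    return [m["name"] for m in data.get("models", [])]


# From another machine on the LAN (replace with your server's IP):
# print(list_models("192.168.1.100"))
```

If this raises a connection error from another device but works on the server itself, OLLAMA_HOST is most likely still 127.0.0.1, or a firewall is blocking port 11434.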

2. LM Studio

LM Studio offers a GUI and an OpenAI-compatible local API. It includes the lms CLI for downloading models, loading them, and running the server from the terminal. See LM Studio CLI docs for the full reference.

2.1 Install

  • macOS: Download from lmstudio.ai/download (Apple Silicon only), or:
    curl -fsSL https://lmstudio.ai/install.sh | bash
  • Windows: Download the installer from the same page, or PowerShell:
    irm https://lmstudio.ai/install.ps1 | iex
  • Linux: Download the AppImage or use the install script from the official site.

16GB+ RAM is recommended; on Windows, 4GB+ dedicated VRAM is recommended. You must run LM Studio at least once before the lms CLI is available.

2.2 Download and load a model (e.g. Gemma 3 4B)

GUI: Open LM Studio, search for Gemma 3 4B or google/gemma-3-4b in the discovery view, choose a quantization (e.g. Q4_K_M), and download. Then load the model in the Local Server / Developer tab.

CLI: Use lms get to search and download models, lms ls to list models on disk, and lms load to load a model (e.g. with --gpu=max or --context-length=8192). Example:

lms get google/gemma-3-4b
lms load google/gemma-3-4b --identifier="gemma3-4b"

Start the server with lms server start; stop it with lms server stop. Custom port: lms server start --port 3000. For web or cross-origin clients, add --cors (use only on a trusted network).
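Once the server is running, it accepts OpenAI-style chat-completions requests. A standard-library sketch against the default port; the model identifier is assumed to match the --identifier used at load time, so adjust both to your setup:

```python
import json
import urllib.request

LMSTUDIO_URL = "http://127.0.0.1:1234"  # default LM Studio server port


def build_chat_request(model: str, user_message: str) -> bytes:
    """Build an OpenAI-compatible /v1/chat/completions request body."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }).encode()


def chat(model: str, user_message: str) -> str:
    """Send one chat turn and return the assistant's reply text."""
    req = urllib.request.Request(
        f"{LMSTUDIO_URL}/v1/chat/completions",
        data=build_chat_request(model, user_message),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.loads(resp.read())
    return data["choices"][0]["message"]["content"]


# Usage (requires the LM Studio server to be running with the model loaded):
# print(chat("gemma3-4b", "Hello!"))
```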

2.3 Allow LAN access

  • GUI: In LM Studio’s server settings, enable “Serve on Local Network”. The server then binds to your machine’s LAN IP so other devices on the same network can reach it (see the LM Studio docs on Serve on Local Network).
  • CLI: Bind to all interfaces so the server is reachable on the LAN:
    lms server start --bind 0.0.0.0
    Or set the environment variable LMS_SERVER_HOST=0.0.0.0 before starting the server.

Default port is usually 1234 (or the last used port). Use http://<your-machine-IP>:1234 as the API base URL from other devices on the LAN.

3. Summary

Item             Ollama                   LM Studio
Example model    gemma3:4b                Gemma 3 4B (Hugging Face)
Pull / download  ollama pull gemma3:4b    GUI or lms get; load with lms load
Default port     11434                    1234
LAN access       OLLAMA_HOST=0.0.0.0      Enable “Serve on Local Network” or --bind 0.0.0.0
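Because LM Studio's server is OpenAI-compatible, and Ollama also exposes an OpenAI-compatible /v1 endpoint alongside its native API, the same client code can target either backend; only the base URL and model name change. A hedged sketch (the IPs and model identifiers in the comments are examples):

```python
import json
import urllib.request


def chat_once(base_url: str, model: str, prompt: str) -> str:
    """Send one chat turn to any OpenAI-compatible /v1 endpoint."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]


# Same call, different backend (replace IPs with your server's):
# chat_once("http://192.168.1.100:11434", "gemma3:4b", "Hi")  # Ollama
# chat_once("http://192.168.1.100:1234", "gemma3-4b", "Hi")   # LM Studio
```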


This page is a public LLM setup reference from the Privy product site, for use with PrivyPDF, PrivyFeed, PrivaTranslate, and other apps that use a local LLM.