Troubleshooting

Start with doctor

Before changing config by hand, run:

cd ~/walter-onprem
bin/walter-onprem doctor

doctor checks:

  • Docker and the Compose plugin
  • Docker daemon reachability
  • release manifest reachability
  • PHX_HOST, deployment mode, and image settings in .env.deploy
  • published host port settings for bundle-managed installs
  • bundle-managed files in the install directory
  • docker compose config -q
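If doctor is unavailable, or you want to see each prerequisite result individually, the same checks can be approximated by hand. This is a sketch, not the doctor implementation; the check helper is made up here, and it expects to be run from the install directory:

```shell
# Print ok/FAIL for each probe instead of aborting on the first failure,
# so every result is visible in one pass.
check() {
  if "$@" >/dev/null 2>&1; then
    echo "ok:   $*"
  else
    echo "FAIL: $*"
  fi
}

check docker --version           # Docker CLI installed
check docker info                # Docker daemon reachable
check docker compose version     # Compose plugin present
check test -f .env.deploy        # running from the install directory
check docker compose config -q   # compose file parses cleanly
```

Any FAIL line points at the same class of problem doctor would report.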

Check service status

cd ~/walter-onprem
bin/walter-onprem status
bin/walter-onprem logs

For raw Docker Compose access:

cd ~/walter-onprem
bin/walter-onprem compose ps
bin/walter-onprem compose logs -f setup walter

Common issues

Install stops with unknown flag: --project-name

This means the host has Docker, but the installer could not find a working Compose command. Walter supports either the docker compose plugin or the standalone docker-compose binary.

On Debian, Ubuntu, and Raspberry Pi OS:

sudo apt-get update
sudo apt-get install -y docker-compose-plugin
docker compose version

If your environment already uses standalone Compose, verify that instead:

docker-compose version

After either command works, rerun the Walter installer.
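To see at a glance which Compose flavor a host actually has, a small sketch (the compose_cmd function name is made up here):

```shell
# Print whichever Compose command form responds on this host.
compose_cmd() {
  if docker compose version >/dev/null 2>&1; then
    echo "docker compose"
  elif docker-compose version >/dev/null 2>&1; then
    echo "docker-compose"
  else
    return 1
  fi
}

compose_cmd || echo "no working Compose command found; install one of the two" >&2
```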

The image pull is unauthorized

Validate the provided credentials explicitly:

echo "<provided-token>" | docker login ghcr.io -u <provided-username> --password-stdin
docker pull ghcr.io/matchylabs/walter:<supported-release>

Walter shows the migration-required page

Run the migration job for the current release, then restart Walter:

cd ~/walter-onprem
bin/walter-onprem compose run --rm setup
bin/walter-onprem compose up -d

The browser setup flow does not appear

With AUTH_MODE=password, first boot should land on the admin setup flow when no users exist. Check the walter container logs and confirm the app can reach PostgreSQL.
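One way to check both things at once; the scan_logs helper and its error patterns are assumptions for this sketch, while the walter and db service names come from this bundle's compose layout:

```shell
# Surface likely database errors from the app logs, or say none were found.
scan_logs() {
  grep -iE 'postgres|connection refused|timeout' || echo "no obvious database errors"
}

# Run from ~/walter-onprem.
if [ -x bin/walter-onprem ]; then
  bin/walter-onprem compose logs walter | scan_logs
  # Ask the bundled PostgreSQL directly; pg_isready exits 0 when it accepts connections.
  bin/walter-onprem compose exec db pg_isready
fi
```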

License or LLM fields were not preseeded

That is expected unless you passed them to bin/walter-onprem install. Walter can accept those values in the browser after the stack starts.

Agents cannot connect

Verify that:

  • PHX_HOST matches the hostname agents can reach
  • WALTER_HTTPS_PORT is set if internal-tls is published on a non-default HTTPS port
  • public-tls and internal-tls clients trust the certificate chain

Internal TLS works on the server but not from your browser

If curl -k https://localhost/ works on the Walter host but your browser cannot load the site from another machine, the usual cause is a PHX_HOST that clients cannot resolve or are not browsing to.

For internal-tls, Walter generates a certificate for PHX_HOST. That means:

  • clients must resolve PHX_HOST to the Walter server
  • clients must browse to that exact hostname or IP
  • if you publish internal-tls on a non-default HTTPS port, clients must use that exact host:port
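The requirements above can be checked from a client machine with openssl; the host and port values below are placeholders for your PHX_HOST and published HTTPS port:

```shell
# Run from a client machine, not the Walter host.
host=walter.internal.example   # substitute your PHX_HOST
port=443                       # or your WALTER_HTTPS_PORT

# Fetch the served certificate and print who it was issued for;
# the subject/SAN must cover the exact name clients browse to.
openssl s_client -connect "$host:$port" -servername "$host" </dev/null 2>/dev/null \
  | openssl x509 -noout -subject -ext subjectAltName \
  || echo "could not fetch a certificate from $host:$port" >&2
```

If the connection itself fails, fix resolution or the port first; if it succeeds but the SAN does not list the name you typed, fix PHX_HOST.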

Common mistake:

  • PHX_HOST=walter.internal.example in .env.deploy
  • no internal DNS or /etc/hosts entry exists for that name

Fix one of these:

  • change PHX_HOST to a real internal DNS name or the server's private IP
  • add a matching /etc/hosts entry on each client machine
  • if you intentionally publish internal-tls on a non-default port, set WALTER_HTTPS_PORT in .env.deploy
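For the /etc/hosts option, a sketch using the example values from this page (substitute your own server IP and PHX_HOST):

```shell
host_ip=192.168.1.50              # the Walter server's private IP
host_name=walter.internal.example # your PHX_HOST
entry="$host_ip $host_name"

echo "add this line to /etc/hosts on each client: $entry"
# Then append it (requires root) and confirm the name resolves:
#   echo "$entry" | sudo tee -a /etc/hosts
#   getent hosts "$host_name"
```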

If you change PHX_HOST, rerun the upgrade with the new host so the bundle is refreshed and Walter regenerates the certificate:

cd ~/walter-onprem
bin/walter-onprem upgrade --host 192.168.1.50 --mode internal-tls

If you also changed the published HTTPS port:

cd ~/walter-onprem
bin/walter-onprem upgrade --host 192.168.1.50 --mode internal-tls --https-port 8443

I re-ran the installer and now I need my old local edits

The installer keeps the previous install directory as a timestamped backup next to ~/walter-onprem.

That backup is for:

  • recovering manual edits to bundle-managed files
  • comparing old and new compose layouts during upgrades
  • restoring values you intentionally kept outside .env.deploy

It is not merged automatically after the bundle refresh, because merging old helper scripts or Compose files back into the new bundle would reintroduce stale topology.
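To recover edits by hand, compare the newest backup against the refreshed bundle and copy changes over selectively; this assumes at least one timestamped backup exists next to the install directory:

```shell
# Pick the newest backup (the timestamp suffix sorts lexically).
latest_backup=$(ls -d "$HOME"/walter-onprem.bak.* 2>/dev/null | sort | tail -n 1)

if [ -n "$latest_backup" ]; then
  # Show every file that differs between the backup and the new bundle.
  diff -ru "$latest_backup" "$HOME/walter-onprem"
else
  echo "no walter-onprem backups found" >&2
fi
```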

I want a completely clean reinstall

Stopping containers is not enough. A normal reinstall keeps the previous .env.deploy, moves the old bundle aside as ~/walter-onprem.bak.*, and leaves Docker named volumes in place.

The full wipe below goes further and destroys all bundled Walter state, including:

  • the bundled PostgreSQL data
  • Walter bootstrap secrets and runtime state
  • the generated internal TLS CA and server certificates
  • public-tls Caddy state, if that mode was in use

To wipe the on-prem server install completely:

cd ~/walter-onprem
bin/walter-onprem compose down --volumes --remove-orphans
rm -rf ~/walter-onprem ~/walter-onprem.bak.*

To confirm nothing from the server bundle remains:

docker ps -a --filter label=com.docker.compose.project=walter-onprem
docker volume ls | grep '^walter-onprem_'
ls -d ~/walter-onprem*

If this machine also has walter-agent installed, that is separate from the on-prem server bundle. To remove the local agent too:

sudo systemctl disable --now walter-agent 2>/dev/null || true
sudo rm -f /etc/systemd/system/walter-agent.service
sudo systemctl daemon-reload
sudo rm -f /usr/local/bin/walter-agent
sudo rm -rf /opt/walter "$HOME/.walter"

After that, rerun the installer to get a clean first boot.

The built-in database is not reachable

Wait a few seconds for PostgreSQL to finish its first start, then re-run:

cd ~/walter-onprem
bin/walter-onprem compose ps
bin/walter-onprem compose logs -f db
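The wait can be scripted instead of done by hand; wait_for is a made-up helper here that retries pg_isready until the database accepts connections or the attempts run out:

```shell
# Retry a command until it succeeds, sleeping between attempts.
wait_for() {
  tries=$1; shift
  n=0
  until "$@"; do
    n=$((n + 1))
    if [ "$n" -ge "$tries" ]; then
      return 1
    fi
    sleep 2
  done
}

# Run from ~/walter-onprem; allow PostgreSQL up to ~30 s on first boot.
if [ -x bin/walter-onprem ]; then
  wait_for 15 bin/walter-onprem compose exec db pg_isready
fi
```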