Troubleshooting

Start with doctor

Before changing config by hand, run:

cd ~/walter-onprem
bin/walter-onprem doctor

doctor checks:

  • Docker and the Compose plugin
  • Docker daemon reachability
  • release manifest reachability
  • PHX_HOST, deployment mode, and image settings in .env.deploy
  • published host port settings for bundle-managed installs
  • bundle-managed files in the install directory
  • docker compose config -q
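If doctor is unavailable, or you want to see each prerequisite result individually, the same checks can be approximated by hand. This is a sketch, not the doctor implementation; the check helper is made up here, and it expects to be run from the install directory:

```shell
# Print ok/FAIL for each probe instead of aborting on the first failure,
# so every result is visible in one pass.
check() {
  if "$@" >/dev/null 2>&1; then
    echo "ok:   $*"
  else
    echo "FAIL: $*"
  fi
}

check docker --version           # Docker CLI installed
check docker info                # Docker daemon reachable
check docker compose version     # Compose plugin present
check test -f .env.deploy        # running from the install directory
check docker compose config -q   # compose file parses cleanly
```

Any FAIL line points at the same class of problem doctor would report.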

Check service status

cd ~/walter-onprem
bin/walter-onprem status
bin/walter-onprem logs

For raw Docker Compose access:

cd ~/walter-onprem
bin/walter-onprem compose ps
bin/walter-onprem compose logs -f setup walter

Common issues

Install stops with unknown flag: --project-name

This means the host has Docker, but the installer could not find a working Compose command. Walter supports either the docker compose plugin or the standalone docker-compose binary.

On Debian, Ubuntu, and Raspberry Pi OS:

sudo apt-get update
sudo apt-get install -y docker-compose-plugin
docker compose version

If your environment already uses standalone Compose, verify that instead:

docker-compose version

After either command works, rerun the Walter installer.
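To see at a glance which Compose flavor a host actually has, a small sketch (the compose_cmd function name is made up here):

```shell
# Print whichever Compose command form responds on this host.
compose_cmd() {
  if docker compose version >/dev/null 2>&1; then
    echo "docker compose"
  elif docker-compose version >/dev/null 2>&1; then
    echo "docker-compose"
  else
    return 1
  fi
}

compose_cmd || echo "no working Compose command found; install one of the two" >&2
```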

The image pull is unauthorized

Validate the provided credentials explicitly:

echo "<provided-token>" | docker login ghcr.io -u <provided-username> --password-stdin
docker pull ghcr.io/matchylabs/walter:<supported-release>

Walter shows the migration-required page

Run the migration job for the current release, then restart Walter:

cd ~/walter-onprem
bin/walter-onprem compose run --rm setup
bin/walter-onprem compose up -d

The browser setup flow does not appear

With AUTH_MODE=password, first boot should land on the admin setup flow when no users exist. Check the walter container logs and confirm the app can reach PostgreSQL.
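One way to check both things at once; the scan_logs helper and its error patterns are assumptions for this sketch, while the walter and db service names come from this bundle's compose layout:

```shell
# Surface likely database errors from the app logs, or say none were found.
scan_logs() {
  grep -iE 'postgres|connection refused|timeout' || echo "no obvious database errors"
}

# Run from ~/walter-onprem.
if [ -x bin/walter-onprem ]; then
  bin/walter-onprem compose logs walter | scan_logs
  # Ask the bundled PostgreSQL directly; pg_isready exits 0 when it accepts connections.
  bin/walter-onprem compose exec db pg_isready
fi
```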

License or LLM fields were not preseeded

That is expected unless you passed them to bin/walter-onprem install. Walter can accept those values in the browser after the stack starts.

Agents cannot connect

Verify that:

  • PHX_HOST matches the hostname agents can reach
  • WALTER_HTTPS_PORT is set if internal-tls is published on a non-default HTTPS port
  • public-tls and internal-tls clients trust the certificate chain

Internal TLS works on the server but not from your browser

If curl -k https://localhost/ works on the Walter host but your browser cannot load the site from another machine, the usual cause is a PHX_HOST that clients cannot resolve or are not browsing to.

For internal-tls, Walter generates a certificate for PHX_HOST. That means:

  • clients must resolve PHX_HOST to the Walter server
  • clients must browse to that exact hostname or IP
  • if you publish internal-tls on a non-default HTTPS port, clients must use that exact host:port
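The requirements above can be checked from a client machine with openssl; the host and port values below are placeholders for your PHX_HOST and published HTTPS port:

```shell
# Run from a client machine, not the Walter host.
host=walter.internal.example   # substitute your PHX_HOST
port=443                       # or your WALTER_HTTPS_PORT

# Fetch the served certificate and print who it was issued for;
# the subject/SAN must cover the exact name clients browse to.
openssl s_client -connect "$host:$port" -servername "$host" </dev/null 2>/dev/null \
  | openssl x509 -noout -subject -ext subjectAltName \
  || echo "could not fetch a certificate from $host:$port" >&2
```

If the connection itself fails, fix resolution or the port first; if it succeeds but the SAN does not list the name you typed, fix PHX_HOST.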

Common mistake:

  • PHX_HOST=walter.internal.example in .env.deploy
  • no internal DNS or /etc/hosts entry exists for that name

Fix one of these:

  • change PHX_HOST to a real internal DNS name or the server's private IP
  • add a matching /etc/hosts entry on each client machine
  • if you intentionally publish internal-tls on a non-default port, set WALTER_HTTPS_PORT in .env.deploy
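For the /etc/hosts option, a sketch using the example values from this page (substitute your own server IP and PHX_HOST):

```shell
host_ip=192.168.1.50              # the Walter server's private IP
host_name=walter.internal.example # your PHX_HOST
entry="$host_ip $host_name"

echo "add this line to /etc/hosts on each client: $entry"
# Then append it (requires root) and confirm the name resolves:
#   echo "$entry" | sudo tee -a /etc/hosts
#   getent hosts "$host_name"
```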

If you change PHX_HOST, rerun the upgrade with the new host so the bundle is refreshed and Walter regenerates the certificate:

cd ~/walter-onprem
bin/walter-onprem upgrade --host 192.168.1.50 --mode internal-tls

If you also changed the published HTTPS port:

cd ~/walter-onprem
bin/walter-onprem upgrade --host 192.168.1.50 --mode internal-tls --https-port 8443

I re-ran the installer and now I need my old local edits

The installer keeps the previous install directory as a timestamped backup next to ~/walter-onprem.

That backup is for:

  • recovering manual edits to bundle-managed files
  • comparing old and new compose layouts during upgrades
  • restoring values you intentionally kept outside .env.deploy

It is not merged automatically after the bundle refresh, because merging old helper scripts or Compose files back into the new bundle would reintroduce stale topology.
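To recover edits by hand, compare the newest backup against the refreshed bundle and copy changes over selectively; this assumes at least one timestamped backup exists next to the install directory:

```shell
# Pick the newest backup (the timestamp suffix sorts lexically).
latest_backup=$(ls -d "$HOME"/walter-onprem.bak.* 2>/dev/null | sort | tail -n 1)

if [ -n "$latest_backup" ]; then
  # Show every file that differs between the backup and the new bundle.
  diff -ru "$latest_backup" "$HOME/walter-onprem"
else
  echo "no walter-onprem backups found" >&2
fi
```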

I want a completely clean reinstall

Stopping containers is not enough. A normal reinstall keeps the previous .env.deploy, moves the old bundle aside as ~/walter-onprem.bak.*, and leaves Docker named volumes in place.

The full wipe below goes further and destroys all bundled Walter state, including:

  • the bundled PostgreSQL data
  • Walter bootstrap secrets and runtime state
  • the generated internal TLS CA and server certificates
  • public-tls Caddy state, if that mode was in use

To wipe the on-prem server install completely:

cd ~/walter-onprem
bin/walter-onprem compose down --volumes --remove-orphans
rm -rf ~/walter-onprem ~/walter-onprem.bak.*

To confirm nothing from the server bundle remains:

docker ps -a --filter label=com.docker.compose.project=walter-onprem
docker volume ls | grep '^walter-onprem_'
ls -d ~/walter-onprem*

If this machine also has walter-agent installed, that is separate from the on-prem server bundle. To remove the local agent too:

sudo systemctl disable --now walter-agent 2>/dev/null || true
sudo rm -f /etc/systemd/system/walter-agent.service
sudo systemctl daemon-reload
sudo rm -f /usr/local/bin/walter-agent
sudo rm -rf /opt/walter "$HOME/.walter"

After that, rerun the installer to get a clean first boot.

The built-in database is not reachable

Wait a few seconds for PostgreSQL to finish its first start, then re-run:

cd ~/walter-onprem
bin/walter-onprem compose ps
bin/walter-onprem compose logs -f db
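The wait can be scripted instead of done by hand; wait_for is a made-up helper here that retries pg_isready until the database accepts connections or the attempts run out:

```shell
# Retry a command until it succeeds, sleeping between attempts.
wait_for() {
  tries=$1; shift
  n=0
  until "$@"; do
    n=$((n + 1))
    if [ "$n" -ge "$tries" ]; then
      return 1
    fi
    sleep 2
  done
}

# Run from ~/walter-onprem; allow PostgreSQL up to ~30 s on first boot.
if [ -x bin/walter-onprem ]; then
  wait_for 15 bin/walter-onprem compose exec db pg_isready
fi
```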