Troubleshooting Guide¶
Common issues and their solutions.
Table of Contents¶
- Git and Update Issues
- Update fails with "local changes would be overwritten"
- Recovering from failed updates
- Breaking changes after update
- Switching branches
- Server-Side Issues
- Services won't start
- Certificate issues
- Admin dashboard not accessible
- sing-box crashes
- WireGuard connected but no traffic
- Hysteria2 not working
- TrustTunnel not connecting
- AmneziaWG not connecting
- CDN VLESS+WS not working
- DNS tunnel not working
- Registry/Build Issues
- Container registry blocked (gcr.io, ghcr.io)
- Building images locally
- Monitoring Issues
- System hangs after starting monitoring
- Grafana shows "No Data"
- Clash-exporter authentication error (401)
- High memory usage from cAdvisor
- Snowflake metrics showing zeros
- WireGuard exporter not starting
- MoaV Test/Client Issues
- Client-Side Issues
- Network-Specific Issues
- Highly Censored Environments
- Reset and Re-bootstrap
- Common Commands
- Getting Help
Git and Update Issues¶
Update fails with "local changes would be overwritten"¶
When running moav update or the installer, you may see:
error: Your local changes to the following files would be overwritten by merge:
scripts/client-test.sh
Please commit your changes or stash them before you merge.
Aborting
Why this happens: - You edited files while testing a fix or feature - You manually modified configuration scripts - You tested a development branch and switched back
Solution 1: Use the interactive prompt (recommended)
The latest MoaV versions detect this and offer options:
moav update
# Will show:
# ⚠ Local changes detected:
# M scripts/client-test.sh
# Options:
# 1) Stash changes (save temporarily, can restore later)
# 2) Discard changes (reset to clean state)
# 3) Abort
Choose option 1 to save your changes, or option 2 to discard them.
Solution 2: Manual stash
cd /opt/moav
# Save your changes temporarily
git stash
# Now update
moav update
# or: git pull
# Restore your changes (may cause conflicts)
git stash pop
Solution 3: Discard changes
If you don't need your local changes:
cd /opt/moav
# Discard all local modifications
git checkout -- .
# Remove untracked files
git clean -fd
# Now update
moav update
Recovering from failed updates¶
If an update fails partway through:
cd /opt/moav
# Check current state
git status
# If there are merge conflicts
git merge --abort
# Reset to last known good state
git reset --hard HEAD
# Try updating again
moav update
If you need to completely reset:
cd /opt/moav
# Fetch latest from remote
git fetch origin
# Hard reset to remote main
git reset --hard origin/main
# Verify
git status
Breaking changes after update¶
Some updates include breaking changes (marked in CHANGELOG) that require regenerating configs. Symptoms include:
- Clients can't connect after update
- Services crash on startup
- Protocol-specific errors (e.g., "invalid obfuscation password")
Option 1: Rebuild configs (keeps users)
This regenerates server config while preserving user credentials. You must redistribute new config bundles to all users.
Option 2: Fresh start (new keys, new users)
If Option 1 doesn't work or you want a clean slate:
# Complete wipe and fresh install
moav uninstall --wipe
# Reconfigure
cp .env.example .env
nano .env # Set DOMAIN, ACME_EMAIL, ADMIN_PASSWORD
# Bootstrap fresh
./moav.sh
After any breaking change update:
1. Download new user bundles from admin dashboard or outputs/bundles/
2. Distribute to all users
3. Users must delete old configs and import new ones
Switching branches¶
Switch to a feature/test branch:
cd /opt/moav
git fetch origin
git checkout feature-branch-name
git pull
moav build # Rebuild containers if needed
Switch back to stable (main):
If switching fails due to local changes:
# Stash changes first
git stash
git checkout main
git pull
# Optionally restore changes
git stash pop
Common scenarios¶
Testing a bug fix from GitHub:
# Save current state
cd /opt/moav
git stash
# Get the fix
git fetch origin
git checkout fix-branch-name
moav build
moav restart
# After testing, return to main
git checkout main
git stash pop # Restore your changes if needed
Accidentally edited files:
# See what changed
git diff
# If you want to keep changes, stash them
git stash
# If you want to discard
git checkout -- filename.sh
# Or discard all changes
git checkout -- .
View stashed changes:
# List all stashes
git stash list
# Show what's in the most recent stash
git stash show -p
# Apply a specific stash
git stash apply stash@{0}
# Delete a stash
git stash drop stash@{0}
Server-Side Issues¶
Services won't start¶
Check logs:
Common causes:
-
Certificate not obtained:
-
Port already in use:
-
Port 53 already in use (for dnstt):
This is usually caused by systemd-resolved:
# Check what's using port 53
ss -ulnp | grep 53
# Stop and disable systemd-resolved
systemctl stop systemd-resolved
systemctl disable systemd-resolved
# Set up direct DNS resolution
echo -e "nameserver 1.1.1.1\nnameserver 8.8.8.8" > /etc/resolv.conf
-
Configuration error:
-
Docker network error ("network not found"):
This happens when Docker networks get corrupted from failed runs:
# Stop all containers and remove networks
docker compose down
docker network prune -f
# Start fresh
docker compose --profile all up -d
- Only some images built:
All services require --profile to be specified:
# Build ALL images including optional services
docker compose --profile all build --no-cache
# Build only proxy services
docker compose --profile proxy build
# Available profiles: proxy, wireguard, dnstt, trusttunnel, admin, conduit, snowflake, monitoring, all
- Port already in use (8443 for Trojan):
Change the Trojan port in your .env file:
Certificate issues¶
Certificate not renewing:
# Manual renewal
docker compose run --rm certbot renew
# Check certificate expiry
docker compose exec sing-box openssl x509 -enddate -noout -in /certs/live/*/fullchain.pem
Certificate acquisition failed: - Ensure DNS A record points to this server - Ensure port 80 is open (temporarily) - Check rate limits: https://letsencrypt.org/docs/rate-limits/
Admin dashboard not accessible¶
Check if container is running:
Verify port is listening:
Test locally first:
Open firewall:
Browser shows security warning (domain-less mode):
In domain-less mode, admin uses a self-signed certificate. This is expected: 1. Click "Advanced" or "Show Details" 2. Click "Proceed to site" or "Accept the Risk"
Access URLs:
- With domain: https://yourdomain.com:9443/
- Domain-less mode: https://YOUR_SERVER_IP:9443/
Admin runs on port 9443 by default (not 8443). The internal container port is 8443, but it's mapped to 9443 externally.
sing-box crashes¶
Check the logs:
Common fixes:
# Rebuild container
docker compose build --no-cache sing-box
docker compose up -d sing-box
# Reset configuration
docker compose --profile setup run --rm bootstrap
WireGuard handshake timeout¶
If you see:
This means UDP packets aren't reaching the server. Common causes:
- UDP port 51820 blocked - Most common in restrictive networks
-
Try WireGuard-wstunnel mode instead (tunnels over TCP/WebSocket)
-
Server firewall:
-
Server WireGuard not running:
WireGuard-wstunnel not connecting¶
If you see errors like:
-
Open port 8080 on server firewall:
-
Check wstunnel is running:
-
Check wstunnel logs:
-
Rebuild after update (if you updated MoaV):
Hysteria2 not working¶
Hysteria2 uses UDP port 443. If it's not working but Reality/Trojan work:
- UDP is likely blocked by your network - this is common in restrictive environments
- Hysteria2 is designed for networks where TCP is throttled but UDP works
- Try other protocols - Reality and Trojan use TCP and are more likely to work
Verify server-side:
# Check Hysteria2 is listening
docker compose logs sing-box | grep -i hysteria
# Test UDP connectivity (from another machine)
nc -vuz YOUR_SERVER_IP 443
TrustTunnel not connecting¶
Check container is running:
Common issues:
-
Port not open:
-
Certificate issue:
- TrustTunnel uses the same Let's Encrypt certificate as other services
-
If cert is missing, run
moav bootstrapagain -
Client config error:
- Verify credentials match
trusttunnel.txtin user bundle - Check
trusttunnel.tomlhas correct domain/IP
AmneziaWG not connecting¶
Check container is running:
Common issues:
-
Port not open:
-
Config mismatch:
- Obfuscation parameters (S1, S2, H1-H4) must match between server and client
-
Re-download the user bundle if parameters are wrong
-
awg-quick not found (client):
- Install awg-tools from https://github.com/amnezia-vpn/amneziawg-tools/releases
- Or use the Amnezia VPN app which includes built-in support
CDN VLESS+WS not working¶
DNS lookup failure:
If you see lookup cdn.yourdomain.com: operation was canceled:
1. Verify cdn subdomain exists in Cloudflare DNS
2. Check it's set to Proxied (orange cloud)
3. Wait for DNS propagation (up to 5 minutes)
Connection refused:
1. Verify port 2082 is open: ufw allow 2082/tcp
2. Check sing-box is listening: docker compose logs sing-box | grep vless-ws
Cloudflare 521 "Web server is down":
This usually means Cloudflare can't reach your origin on the correct port.
- Check Origin Rule exists (most common cause):
- Go to Cloudflare → Rules → Origin Rules
- You need a rule that redirects
cdn.yourdomain.comto port 2082 - Without this, Cloudflare connects to port 80 (wrong port)
-
See DNS.md Cloudflare section for setup instructions
-
Verify port 2082 is reachable:
-
Check firewall:
-
Verify sing-box is listening:
Cloudflare 525 "SSL Handshake Failed":
This means Cloudflare is trying HTTPS to your origin, but MoaV's CDN inbound on port 2082 is plain HTTP.
- Set SSL/TLS mode to Flexible in Cloudflare dashboard:
- Go to SSL/TLS → Overview → Set to Flexible
- Full and Full (Strict) will NOT work — they make Cloudflare connect via HTTPS, but port 2082 doesn't speak TLS
- If you need Full SSL for other subdomains, create a Configuration Rule:
- Go to Rules → Configuration Rules → Create rule
- Match: Hostname equals
cdn.yourdomain.com - Setting: SSL → Flexible
- This overrides the zone-wide SSL mode for just the CDN subdomain
Cloudflare 520 "Unknown error":
1. Set SSL/TLS mode to Flexible in Cloudflare dashboard (see 525 section above)
2. Verify sing-box container is running
3. Check sing-box config has vless-ws-in inbound on port 2082
WireGuard connected but no traffic¶
Check if peer is loaded:
Look for your peer's public key. It should show:
- latest handshake: X seconds ago
- transfer: X received, X sent
If there's no handshake, check for key mismatch:
# What the client config expects (server public key)
cat configs/wireguard/server.pub
# What's actually running
docker compose exec wireguard wg show wg0 public-key
If keys don't match, run the sync script:
# Automatically sync keys from running WireGuard
./scripts/wg-sync-keys.sh
# Or manually fix:
docker compose exec wireguard wg show wg0 public-key > configs/wireguard/server.pub
# Regenerate user with correct key
./scripts/wg-user-add.sh newuser
Check NAT/masquerade:
Check IP forwarding:
Check firewall allows WireGuard port:
Update MoaV if issue persists:
Older versions had missing iptables rules for return traffic. Update and rebuild:
cd /opt/moav
moav update
docker compose --profile wireguard build --no-cache wireguard
moav restart wireguard
DNS tunnel not working¶
Check dnstt logs for domain issues:
If you see NXDOMAIN: not authoritative for example.com, the domain wasn't set correctly during bootstrap:
# Check the config file
cat configs/dnstt/server.conf
# Should show: DNSTT_DOMAIN=t.yourdomain.com (not example.com)
# If wrong, update it
sed -i 's/example.com/yourdomain.com/g' configs/dnstt/server.conf
# Rebuild and restart dnstt
docker compose build dnstt
docker compose --profile dnstt up -d dnstt
Verify NS delegation:
Test dnstt server:
Check firewall:
Registry/Build Issues¶
Container registry blocked (gcr.io, ghcr.io)¶
In some regions (Iran, Russia, China), certain container registries are blocked:
| Registry | Images Affected | Status |
|---|---|---|
gcr.io |
cAdvisor | Often blocked |
ghcr.io |
clash-exporter | Often blocked |
docker.io |
Most base images | Usually works (mirrors available) |
Symptoms:
- docker pull hangs or times out
- Build fails with "connection refused" or "timeout"
- Monitoring stack won't start
Solution: Build blocked images locally using moav build --local:
# Build commonly blocked images (gcr.io, ghcr.io)
moav build --local
# Build specific image
moav build --local cadvisor
moav build --local clash-exporter
# Build ALL external images locally
moav build --local all
Building images locally¶
MoaV can build monitoring stack images from source when registries are blocked.
Available images for local build:
| Image | Registry | Build Command |
|---|---|---|
| cAdvisor | gcr.io | moav build --local cadvisor |
| clash-exporter | ghcr.io | moav build --local clash-exporter |
| Prometheus | docker.io | moav build --local prometheus |
| Grafana | docker.io | moav build --local grafana |
| Node Exporter | docker.io | moav build --local node-exporter |
| Nginx | docker.io | moav build --local nginx |
| Certbot | docker.io | moav build --local certbot |
How it works:
1. Downloads pre-built binaries from GitHub releases (not blocked)
2. Creates a local Docker image
3. Updates .env to use the local image
Version control:
Set versions in .env before building:
# In .env
PROMETHEUS_VERSION=3.5.1
GRAFANA_VERSION=12.3.3
NODE_EXPORTER_VERSION=1.10.2
CADVISOR_VERSION=0.56.2
CLASH_EXPORTER_VERSION=0.0.4
Force rebuild:
Build everything locally (no registry pulls):
This builds both MoaV services and all external monitoring images.
Monitoring Issues¶
Warning: The monitoring stack nearly doubles MoaV's resource requirements. While MoaV alone runs on 1 vCPU / 1GB RAM, adding monitoring requires at least 2 vCPU / 2GB RAM for stable operation.
System hangs after starting monitoring¶
If your server hangs or becomes unresponsive after starting monitoring (especially the first time), you're likely running out of RAM.
Symptoms: - SSH connection freezes - Commands stop responding - Server becomes unreachable
Solution 1: Recover and disable monitoring
If you can still SSH in (wait a few minutes):
# Stop all monitoring services
docker compose --profile monitoring stop
# Or stop individual heavy services
docker stop moav-prometheus moav-grafana moav-cadvisor
If SSH is frozen, reboot via your VPS control panel, then:
Solution 2: Upgrade your server
Monitoring requires at least 2GB RAM. Upgrade your VPS to 2GB+ RAM before enabling monitoring.
Solution 3: Run lighter monitoring
If you must have metrics on 1GB RAM, disable the heaviest components:
# Start only essential monitoring (skip cAdvisor)
docker compose --profile monitoring up -d prometheus grafana node-exporter clash-exporter
# Stop cAdvisor if running (uses ~150MB)
docker stop moav-cadvisor
Grafana shows "No Data"¶
-
Check Prometheus is running:
-
Verify targets are up - access Prometheus internally:
-
Ensure services are on the same Docker network (
moav_net)
Clash-exporter authentication error (401)¶
Symptoms:
This means CLASH_API_SECRET in .env doesn't match the secret in sing-box's config. This typically happens after a re-bootstrap where the state volume has a different secret than .env.
Diagnose:
# What .env has (used by clash-exporter)
grep CLASH_API_SECRET .env
# What sing-box actually uses (source of truth)
docker compose exec sing-box cat /etc/sing-box/config.json | python3 -m json.tool | grep -A2 clash_api
Fix:
# Sync .env with the actual sing-box secret
SECRET=$(docker compose exec sing-box cat /etc/sing-box/config.json | python3 -c "import sys,json; print(json.load(sys.stdin)['experimental']['clash_api']['secret'])")
sed -i "s/^CLASH_API_SECRET=.*/CLASH_API_SECRET=$SECRET/" .env
docker compose restart clash-exporter
Or use moav restart monitoring — the ensure_clash_api_secret() function now auto-syncs stale secrets from the state volume on startup.
High memory usage from cAdvisor¶
Limit cAdvisor resources in docker-compose.yml:
Snowflake metrics showing zeros¶
The Snowflake exporter parses log files for summary statistics. Summaries are logged periodically. If you just started Snowflake, wait for the first summary to appear:
# Check if summaries exist
docker exec moav-snowflake cat /var/log/snowflake/snowflake.log | grep "In the"
WireGuard exporter not starting¶
The exporter needs read access to WireGuard config. Check:
For complete monitoring documentation, see MONITORING.md.
MoaV Test/Client Issues¶
moav test fails to build¶
Docker build errors:
# Rebuild with no cache
moav client build --no-cache
# Or manually:
docker build --no-cache -t moav-client -f Dockerfile.client .
Network issues during build: - Pre-built binaries are downloaded from GitHub/GitLab - If downloads fail, the build falls back to compiling from source (slower) - Check your server has internet access
moav test shows all protocols as "skip"¶
User bundle not found:
Bundle path issue:
# Verify the bundle contains config files
ls outputs/bundles/user1/
# Should contain: reality.txt, trojan.txt, hysteria2.yaml, etc.
moav test shows "sing-box failed to start"¶
Configuration format issue:
- sing-box 1.12+ requires route.final instead of deprecated special outbounds
- Check sing-box version: docker run --rm moav-client sing-box version
Debug with verbose output:
# Run test container interactively
docker run --rm -it \
-v "$(pwd)/outputs/bundles/user1:/config:ro" \
moav-client /bin/bash
# Inside container, manually test
VERBOSE=true CONFIG_DIR=/config /app/client-test.sh
moav client connect can't establish connection¶
Check server is running:
Try different protocols:
Check firewall on server:
Client proxy ports already in use¶
Change ports in .env:
Or stop conflicting service:
WireGuard test shows "endpoint not reachable"¶
This is expected if: - UDP port 51820 is blocked by firewall - Server WireGuard container is not running
Check WireGuard is running:
Check UDP is not blocked:
Tor/Snowflake fallback not working¶
Tor is standalone and doesn't require your server:
If binaries are missing: - Some optional binaries may fail to download during build - Check build logs for "not available (optional)" messages
For Psiphon: - Psiphon is not available via MoaV client - Use the official Psiphon apps instead
Client-Side Issues¶
Can't connect at all¶
-
Verify server is reachable:
-
Check if IP is blocked:
- Try from a different network (mobile data)
-
Use online tools to check if IP is accessible from Iran
-
Try different protocols: Reality → Hysteria2 → Trojan → WireGuard → DNS tunnel
TLS handshake timeout¶
Causes: - Server certificate issue - Deep packet inspection blocking - Server overloaded
Solutions: 1. Try Reality protocol (doesn't use your cert) 2. Try Hysteria2 (uses UDP) 3. Check server certificate is valid
Slow connection¶
Hysteria2 often helps - it's optimized for lossy networks.
For sing-box clients: - Enable multiplexing - Try different congestion control
Check server resources:
Frequent disconnections¶
- Enable keep-alive:
-
In client app, look for "persistent connection" or "keep-alive"
-
Check server uptime:
-
Check for IP blocks:
- ISP may be actively disrupting connections
- Try rotating to a new server IP
"Invalid config" errors¶
- Ensure you're using the correct link for your app
- Check for extra spaces or newlines in the link
- Try importing the JSON file instead of the link
Network-Specific Issues¶
Works on WiFi but not mobile data¶
Mobile carriers may have different filtering: - Try Hysteria2 (UDP-based) - Try DNS tunnel - Some carriers block all VPN signatures
Works on mobile data but not WiFi¶
Home ISPs often have stricter filtering: - Try Reality protocol - Try different Reality target sites - Try port 80 or other ports (if configured)
Very slow despite connection working¶
- Check if throttled:
- Speed test without VPN
- Speed test with VPN
-
If VPN is significantly slower, you're being throttled
-
Try Hysteria2:
- Uses UDP which is sometimes less throttled
-
Has built-in congestion control
-
Try different times:
- Filtering may be heavier during peak hours
highly censored environments-Specific Issues¶
All protocols blocked¶
When ISP blocks everything:
- DNS Tunnel - Often still works as it's hard to block all DNS
- Different Reality targets - Try:
www.apple.comdl.google.comwww.samsung.com-
update.microsoft.com -
Get a new server - Your IP may be specifically blocked
Protocol detected and blocked¶
Signs your protocol is detected: - Works for a few minutes then dies - Works initially then stops - Specific protocol fails but others work
Solutions: 1. Switch protocols immediately 2. Change Reality target domain 3. Update to latest sing-box version (better anti-detection)
Total internet shutdown¶
During major events, Govs sometimes shuts internet entirely:
- DNS tunnel might still work (if any DNS works)
- Satellite internet (Starlink) if available
- Wait for restoration
Reset and Re-bootstrap¶
Full reset (start fresh)¶
If things are broken beyond repair, reset everything:
# Complete wipe - removes all containers, volumes, configs, keys, bundles
moav uninstall --wipe
# Reconfigure
cp .env.example .env
nano .env # Set DOMAIN, ACME_EMAIL, ADMIN_PASSWORD
# Fresh bootstrap
./moav.sh
This gives you a completely clean installation with new keys and certificates.
Partial reset (keep data)¶
Remove containers but keep your configuration for quick reinstall:
# Remove containers only, keep .env, keys, bundles
moav uninstall
# Reinstall and start
./moav.sh install
moav start
Re-bootstrap only¶
To regenerate server config without removing anything:
# Remove only the bootstrap flag
docker run --rm -v moav_moav_state:/state alpine rm /state/.bootstrapped
# Re-run bootstrap
moav bootstrap
# Restart services
moav restart
Reset only WireGuard¶
# Remove WireGuard config
rm configs/wireguard/wg0.conf configs/wireguard/server.pub
# Remove WireGuard keys from state
docker run --rm -v moav_moav_state:/state alpine rm -f /state/keys/wg-server.key /state/keys/wg-server.pub
# Remove bootstrap flag and re-run
docker run --rm -v moav_moav_state:/state alpine rm /state/.bootstrapped
docker compose --profile setup run --rm bootstrap
# Restart WireGuard
docker compose --profile wireguard up -d wireguard
Reset only dnstt¶
# Remove dnstt config and keys
rm configs/dnstt/server.conf configs/dnstt/server.pub
docker run --rm -v moav_moav_state:/state alpine rm -f /state/keys/dnstt-*
# Remove bootstrap flag and re-run
docker run --rm -v moav_moav_state:/state alpine rm /state/.bootstrapped
docker compose --profile setup run --rm bootstrap
# Restart dnstt
docker compose --profile dnstt up -d dnstt
Common Commands¶
View logs¶
# All services
docker compose logs
# Specific service
docker compose logs sing-box
docker compose logs -f sing-box # Follow
# Last 100 lines
docker compose logs --tail=100 sing-box
Restart services¶
# Restart all (specify the profile you're using)
docker compose --profile all restart
# Restart specific service
docker compose --profile proxy restart sing-box
# Full rebuild
docker compose --profile all down
docker compose --profile all up -d --build
Apply .env changes¶
Important: Docker caches environment variables at container creation time. Simply restarting a service does NOT pick up .env changes.
# WRONG - does NOT apply .env changes
docker compose restart snowflake
# CORRECT - recreates container with new .env values
docker compose up -d --force-recreate snowflake
# Or use moav (handles this automatically)
moav stop snowflake && moav start snowflake
Check resource usage¶
Test connectivity¶
# Test from server
curl -I https://google.com
# Test TLS
openssl s_client -connect yourdomain.com:443
# Test specific protocol
# (run from a client that works)
Reload configuration¶
# sing-box hot reload
docker compose exec sing-box sing-box reload
# Or restart container
docker compose restart sing-box
Getting Help¶
If issues persist:
-
Collect logs:
-
Check configuration:
-
Verify network:
-
Document:
- What protocol you're trying
- What client app and version
- Error messages
- When it started failing