Understanding the Problem
Issue Summary
A custom or third-party systemd service on Ubuntu fails to start during boot but starts successfully with systemctl start after login. The result can be an unavailable application, a monitoring agent that comes up late, or partial service availability in clustered environments.
Common Symptoms
- Service logs show "dependency failed" or "network unavailable" errors
- systemctl status shows "inactive (dead)" at boot time
- journalctl -xe reports race conditions or "unit not found"
- Manual start works fine post-login
Root Cause Analysis
Systemd Boot Semantics
systemd starts units in parallel unless ordering dependencies explicitly serialize them. Services relying on network interfaces, mounts, or sockets can attempt to start before their requirements are fully ready, especially on fast-booting systems or cloud VMs.
Dependency Graph Problems
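After= only controls ordering, while Requires= and Wants= only pull a unit into the boot transaction. When any of these directives is missing or points at the wrong unit, systemd can launch the service prematurely. The following unit, for example, waits only for network.target, which does not guarantee a configured interface:
[Unit]
Description=Custom Daemon
After=network.target
Requires=network.target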
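To see how systemd actually resolved these directives, the effective dependency properties can be queried from the running manager. This is a minimal sketch; the function name show_unit_deps and the unit name my-service.service are placeholders chosen for illustration.

```shell
#!/bin/sh
# Print the ordering (After=) and requirement (Requires=, Wants=) properties
# that systemd has loaded for a unit.
# show_unit_deps is a hypothetical helper name; pass your own unit name.
show_unit_deps() {
    unit="$1"
    systemctl show --no-pager --property=After,Requires,Wants "$unit"
}

# Example (run on a systemd host):
#   show_unit_deps my-service.service
```

If a unit appears in Wants= or Requires= but not in After=, systemd starts both in parallel, which is a common source of the boot-only failure described here.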
Race Conditions and Timeouts
Services that depend on interfaces such as eth0 or docker0 might launch before DHCP has assigned an address or before the virtual interface exists. Likewise, cloud-init or Netplan may not have finished applying configuration when the service starts.
Architectural Implications
System Reliability and Observability
Critical services failing at boot result in alert fatigue and brittle environments. These failures are hard to trace because post-boot state appears clean, masking underlying race conditions.
Cluster and Microservice Impact
In HA setups or service meshes, delayed service registration affects discovery and auto-healing. System readiness must be deterministic for orchestrators to function correctly.
Diagnostics and Verification
Use systemd-analyze
systemd-analyze blame
systemd-analyze critical-chain
This identifies startup delays and highlights which services blocked or failed early in the boot sequence.
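The same tool can narrow the analysis to a single unit and export a timeline of the whole boot. The sketch below assumes a unit called my-service.service; boot_report and the output path are hypothetical names used only for illustration.

```shell
#!/bin/sh
# boot_report is a hypothetical helper: it prints the chain of units that
# the given unit waited for and saves a boot timeline as an SVG file.
# $1: unit name   $2: path of the SVG file to write
boot_report() {
    unit="$1"
    svg="$2"
    # Time-critical chain of units leading up to the given unit.
    systemd-analyze critical-chain "$unit"
    # Graphical timeline of the boot; open the SVG in a browser.
    systemd-analyze plot > "$svg"
}

# Example:
#   boot_report my-service.service /tmp/boot.svg
```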
Check Unit File Consistency
systemctl cat my-service.service
systemd-analyze verify /etc/systemd/system/my-service.service
Verify that every unit named in Requires= and After= actually exists on the system.
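One way to automate that existence check is to ask systemd for the load state of each referenced unit. This is a rough sketch, not a complete validator: missing_deps is a hypothetical helper, and it assumes a systemctl version that supports the --value option and reports LoadState=not-found for units it cannot locate.

```shell
#!/bin/sh
# missing_deps is a hypothetical helper: it prints every unit listed in
# Requires= or After= of the given unit that systemd could not find.
missing_deps() {
    unit="$1"
    # --value prints only the property values (space-separated unit names).
    for dep in $(systemctl show --property=Requires,After --value "$unit"); do
        # "--" stops option parsing, since names such as "-.mount" start with a dash.
        state=$(systemctl show --property=LoadState --value -- "$dep")
        [ "$state" = "not-found" ] && echo "$dep"
    done
    return 0
}

# Example:
#   missing_deps my-service.service
```

An empty result means every referenced unit was found; it says nothing about whether the ordering between them is sufficient.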
Monitor Logs Early in Boot
journalctl -b -0 -u my-service.service
This reveals early boot issues that do not appear once the service is started manually.
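When the failure happened on an earlier boot, the journal can be filtered by boot and by severity. A small sketch follows; boot_logs is a hypothetical helper, and logs from previous boots are only available if the journal is stored persistently.

```shell
#!/bin/sh
# boot_logs is a hypothetical helper: it shows messages of priority
# "warning" or more severe for one unit during one boot.
# $1: unit name   $2: boot offset (0 = current boot, -1 = previous boot)
boot_logs() {
    unit="$1"
    boot="${2:-0}"
    journalctl --no-pager --boot="$boot" --unit="$unit" --priority=warning
}

# Examples:
#   boot_logs my-service.service        # current boot
#   boot_logs my-service.service -1     # previous boot
```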
Step-by-Step Fix
1. Harden Service Dependencies
Replace generic network.target with network-online.target if the service needs full connectivity.
[Unit]
After=network-online.target
Wants=network-online.target
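network-online.target only delays anything if a wait-online service is enabled for the network stack in use. The sketch below checks the two services commonly found on Ubuntu; wait_online_enabled is a hypothetical helper name, and other network managers may use different units.

```shell
#!/bin/sh
# wait_online_enabled is a hypothetical helper: it prints the first enabled
# wait-online service and returns 1 if neither candidate is enabled.
wait_online_enabled() {
    for svc in systemd-networkd-wait-online.service NetworkManager-wait-online.service; do
        if systemctl is-enabled --quiet "$svc" 2>/dev/null; then
            echo "$svc"
            return 0
        fi
    done
    return 1
}

# Example:
#   wait_online_enabled || echo "no wait-online service is enabled"
```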
2. Add Explicit Dependencies
Ensure dependent services or mounts are listed in Requires= or Wants= to avoid soft failures.
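For units shipped by a package, such dependencies are usually added through a drop-in rather than by editing the packaged file. The following sketch writes one; write_dependency_dropin, the file name 20-dependencies.conf, and the example names postgresql.service and /srv/data are all illustrative assumptions.

```shell
#!/bin/sh
# write_dependency_dropin is a hypothetical helper. It writes a drop-in that
# makes a unit require another unit and a mounted path, ordered after both.
# $1: systemd unit directory (normally /etc/systemd/system)
# $2: unit to modify            $3: unit it depends on
# $4: absolute path that must be mounted before the unit starts
write_dependency_dropin() {
    dropin_dir="$1/$2.d"
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/20-dependencies.conf" <<EOF
[Unit]
Requires=$3
After=$3
RequiresMountsFor=$4
EOF
}

# Example (as root), followed by a reload so systemd picks up the change:
#   write_dependency_dropin /etc/systemd/system my-service.service postgresql.service /srv/data
#   systemctl daemon-reload
```

Requires= is paired with After= because a requirement on its own does not order the two units. RequiresMountsFor= adds both the requirement and the ordering for the mount units covering the path.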
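3. Use systemd Service Conditions
ConditionPathExists= and the other Condition*= directives are evaluated once, when the unit is about to start. If the check fails, systemd skips the unit rather than waiting for the condition to become true. Use conditions to guard against a missing prerequisite, and use ExecStartPre= with a script that polls for readiness when startup must actually be delayed.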
4. Extend Timeouts if Necessary
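[Service]
TimeoutStartSec=90
ExecStartPre=/usr/local/bin/check-network.sh
The ExecStartPre= script checks network reachability or other required system state before the service launches. Its run time counts against TimeoutStartSec=, so the timeout must be longer than the longest wait the script allows.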
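The contents of check-network.sh are not given here, so the following is only one possible sketch. It assumes that "network ready" means an IPv4 default route exists and that the iproute2 ip command is installed; the function names and the 60-second budget are illustrative.

```shell
#!/bin/sh
# Sketch of a readiness check for ExecStartPre= (assumed contents).

# Succeeds once the routing table contains an IPv4 default route.
has_default_route() {
    [ -n "$(ip route show default 2>/dev/null)" ]
}

# wait_until TIMEOUT INTERVAL COMMAND...
# Retries COMMAND every INTERVAL seconds and returns 1 once about
# TIMEOUT seconds have passed without success.
wait_until() {
    timeout="$1"; interval="$2"; shift 2
    waited=0
    until "$@"; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}

# The last line of the real script would be a call such as:
#   wait_until 60 2 has_default_route || exit 1
```

A 60-second budget stays below TimeoutStartSec=90, so a slow network makes the script exit with its own error rather than being killed by systemd.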
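5. Reload systemd and Enable the Service
systemctl daemon-reload
systemctl enable my-service
daemon-reload is sufficient to pick up edited unit files. systemctl daemon-reexec re-executes the systemd manager itself and is not needed for unit file changes.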
Best Practices
- Use network-online.target instead of network.target for connectivity-based services
- Always verify service unit dependencies using systemd-analyze verify
- Use ExecStartPre to script critical prechecks
- Do not rely on login sessions or user timers for service readiness
- In cloud environments, ensure proper cloud-init finalization before dependent services start
Conclusion
Systemd is a powerful but strict initialization system. Misconfigured unit files or ambiguous dependencies can silently delay or prevent service startup at boot, leading to erratic behavior in production. By adopting deterministic boot sequencing and verifying system dependencies with appropriate tooling, Ubuntu administrators can eliminate this class of error and build more resilient operating system configurations.
FAQs
1. Why does my service fail only at boot but start fine manually?
At boot, required dependencies like network interfaces or mounts may not be ready. Manual starts occur after these resources are available, hiding timing issues.
2. How do I delay a systemd service until networking is ready?
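Use After=network-online.target and Wants=network-online.target in your unit file. Also make sure the matching wait-online service is enabled: systemd-networkd-wait-online.service on systems managed by systemd-networkd, or NetworkManager-wait-online.service on systems managed by NetworkManager.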
3. Can I visualize service startup timing?
Yes. Use systemd-analyze blame and systemd-analyze critical-chain to see boot-time delays and dependencies.
4. What is the role of systemd-analyze verify?
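It loads the unit file together with the units it references and warns about unknown sections or directives, missing dependency units, and ExecStart= commands that are missing or not executable. It does not detect timing problems, so a unit that passes can still race at boot.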
5. Is network.target sufficient for networking services?
Not always. network.target only indicates that the network management stack has been started. network-online.target is reached once the enabled wait-online service reports the network as up, which usually means an interface has a routable IP address.