We use Grafana and Prometheus heavily for day-to-day monitoring and alerting. Today Spacelift has dashboards in the UI and a Prometheus exporter, but the metric coverage is limited, and we mostly open the UI only when something is already broken.
We want a standard way to export richer Spacelift operational metrics so we can proactively detect regressions and performance changes, for example when a workflow gradually becomes slower, or when a specific stack starts taking longer after a change.
A concrete example: we recently spent time optimizing Ansible execution time, and we want to make sure it does not regress in the future. Having metrics we can store and alert on would help a lot.
What we want (ideal solution): A Prometheus-compatible endpoint or exporter with broader, well-documented metrics, so we can scrape them and build Grafana dashboards and alerts. Alternatively, a direct Grafana integration is fine, but a standard metrics endpoint is best.
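To illustrate the kind of output we have in mind, here is a minimal sketch that renders per-stack run durations in the Prometheus text exposition format. The metric name `spacelift_run_duration_seconds`, the `stack` label, and the `RunSample` type are all hypothetical and chosen only for this example. They are not part of any existing Spacelift exporter.

```python
from dataclasses import dataclass


@dataclass
class RunSample:
    """One observed run (hypothetical shape, for illustration only)."""
    stack: str
    duration_seconds: float


def escape_label(value: str) -> str:
    # The text exposition format requires escaping backslash,
    # double quote, and newline inside label values.
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def render_metrics(samples):
    """Render samples as a Prometheus text-format payload."""
    name = "spacelift_run_duration_seconds"  # hypothetical metric name
    lines = [
        f"# HELP {name} Duration of the latest run per stack.",
        f"# TYPE {name} gauge",
    ]
    for s in samples:
        label = escape_label(s.stack)
        lines.append(f'{name}{{stack="{label}"}} {s.duration_seconds}')
    # The payload ends with a trailing newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_metrics([RunSample("ansible-prod", 42.5)]), end="")
```

With a series like this scraped into Prometheus, we could graph the duration per stack in Grafana and alert when it exceeds a threshold or drifts upward over time.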
In Review
Feature Requests
2 months ago