We have hundreds of stacks. More than half of them are in an error state. Most of those correspond to development/test infrastructure, and staffing a team to do the cleanup would be expensive.
Many of those failures come down to “someone manually upgraded the database, so we need to create a PR to reflect the new DB version” or “this resource got created manually, and now the apply is failing because it tries to create a duplicate.”
In addition to failing stacks, some stacks are simply in disrepair. Missing imports, missing move blocks, and other defects mean that maintaining stack state is difficult.
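To make the "missing imports, missing move blocks" case concrete, here is a rough sketch of the kind of boilerplate an agent would need to generate. It assumes a Terraform version that supports `import` blocks (1.5+) and `moved` blocks (1.1+). The helper names and the resource addresses in the usage lines are hypothetical, made up for illustration.

```python
def import_block(address: str, resource_id: str) -> str:
    """Render an `import` block that adopts an existing resource into state."""
    return f'import {{\n  to = {address}\n  id = "{resource_id}"\n}}\n'


def moved_block(old_address: str, new_address: str) -> str:
    """Render a `moved` block that records a rename without destroy/recreate."""
    return f"moved {{\n  from = {old_address}\n  to   = {new_address}\n}}\n"


# Hypothetical addresses and IDs, for illustration only.
print(import_block("aws_db_instance.main", "dev-database"))
print(moved_block("aws_instance.web", "module.app.aws_instance.web"))
```

Writing blocks like these by hand is not hard for one resource, but it adds up quickly across hundreds of stacks.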
Obviously, teams using Terraform should use it consistently rather than performing manual operations, but for non-production systems that’s sometimes significantly more expensive and time-consuming than clicking a button in AWS.
There’s also a very large category of refactor-type work that’s very expensive to do manually, where you want to restructure some modules or move resources across stacks.
I’d like to propose that these tasks are exceptionally well suited to LLM agent-based workflows. We already have the technology to restrict the permissions available during terraform plan, which dramatically reduces the risk of malicious configuration being introduced at that phase. Spacelift already has sophisticated workflow approvals. Often, what I want is for something or someone to operate in a loop, making configuration changes and manipulating the state until terraform plan produces a plan with no delta, i.e. “No changes. Your infrastructure matches the configuration.”
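The loop I have in mind could look roughly like the sketch below. It relies on `terraform plan -detailed-exitcode`, which exits with 0 when there are no changes, 1 on error, and 2 when the plan contains a delta. Everything else is an assumption for illustration: `run_plan`, `converge`, and the `propose_fix` hook are hypothetical names, not an existing Spacelift or Terraform API, and the hook stands in for whatever agent edits the configuration or state.

```python
import subprocess
from typing import Callable, Tuple

# Exit codes documented for `terraform plan -detailed-exitcode`.
NO_CHANGES, PLAN_ERROR, HAS_CHANGES = 0, 1, 2


def run_plan(stack_dir: str) -> Tuple[int, str]:
    """Run `terraform plan` in stack_dir; return (exit code, combined output)."""
    proc = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=stack_dir,
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout + proc.stderr


def converge(
    plan: Callable[[], Tuple[int, str]],
    propose_fix: Callable[[str], None],
    max_rounds: int = 5,
) -> bool:
    """Loop until the plan is clean or the round budget runs out.

    propose_fix is a hypothetical hook: it receives the plan output and is
    expected to change configuration or state, e.g. an LLM agent whose
    changes still go through the normal approval flow.
    """
    for _ in range(max_rounds):
        code, output = plan()
        if code == NO_CHANGES:
            return True
        # The plan either errored (1) or shows a delta (2); hand the output over.
        propose_fix(output)
    # One last check after the final fix attempt.
    return plan()[0] == NO_CHANGES
```

In practice `plan` would be something like `lambda: run_plan("stacks/dev-db")` (a made-up path). The round budget is there so that a stack the agent cannot fix gets escalated to a human instead of looping forever.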
That’s an incredibly clear objective to give to an LLM! I’d be thrilled if Spacelift implemented tooling to help teams achieve better resource management.
👀 In Review
💡 Feature Requests
About 3 hours ago