Improve debugging information when worker crash

IE, you can check http://169.254.169.254/latest/meta-data/instance-action to get info that EC2 instance start terminating/stopping/rebooting
(Also, http://169.254.169.254/latest/meta-data/spot/termination-time available only for spot instances)Ideally, show message why it was terminated ("Instance terminated by ASG AZ rebalancing", etc) but it could requires much more work

Workaround
Spent months to debug with SP support and developers to figureout what can be a root cause
Problem
Observability and debugging issues

Please authenticate to join the conversation.

Upvoters
Status

✅ Completed

Board

💡 Feature Requests

Date

Over 1 year ago

Subscribe to post

Get notified by email when there are changes.