A developer cleaning up a test environment accidentally includes a production KMS key in their deletion script. The key is scheduled for deletion on Friday afternoon with a 7-day waiting period (the shortest KMS allows; the default is 30 days). The production team notices encrypted data becoming inaccessible the following Tuesday, with 3 days left before the key is gone permanently.
This happens. The recovery window is too short for normal detection cycles. Weekends, holidays, and cross-team dependencies mean 7 days disappears fast.
The fix is a single SCP statement that denies any deletion window under 30 days:
{
  "Sid": "KMSShortDel",
  "Effect": "Deny",
  "Action": "kms:ScheduleKeyDeletion",
  "Resource": "*",
  "Condition": {
    "NumericLessThan": {
      "kms:ScheduleKeyDeletionPendingWindowInDays": "30"
    }
  }
}
With this in place, aws kms schedule-key-deletion --pending-window-in-days 7 fails with an AccessDeniedException. Because KMS itself caps the waiting period at 30 days, the only waiting period that can still be requested is 30.
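To deploy the statement, it has to be wrapped in a full policy document and attached to a target. The following is a minimal sketch using the AWS Organizations API through boto3. The policy name, the description, and the target ID are placeholders, and boto3 is imported lazily so the document-building step runs without the SDK or credentials.

```python
import json


def build_deletion_window_scp(min_days=30):
    """Return an SCP document denying ScheduleKeyDeletion below min_days."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "KMSShortDel",
                "Effect": "Deny",
                "Action": "kms:ScheduleKeyDeletion",
                "Resource": "*",
                "Condition": {
                    "NumericLessThan": {
                        "kms:ScheduleKeyDeletionPendingWindowInDays": str(min_days)
                    }
                },
            }
        ],
    }


def deploy_scp(policy, target_id, name="kms-min-deletion-window"):
    """Create the SCP and attach it to a root, OU, or account ID.

    The policy name is a placeholder; target_id is whichever OU or
    account the rollout has reached.
    """
    import boto3  # imported here so build_deletion_window_scp needs no SDK

    org = boto3.client("organizations")
    created = org.create_policy(
        Content=json.dumps(policy),
        Description="Deny KMS key deletion windows under the minimum",
        Name=name,
        Type="SERVICE_CONTROL_POLICY",
    )
    policy_id = created["Policy"]["PolicySummary"]["Id"]
    org.attach_policy(PolicyId=policy_id, TargetId=target_id)
    return policy_id
```

Keeping the document builder separate from the deployment call makes it easy to review or diff the JSON before anything is attached.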
Full SCP with production key protection
The deletion window SCP plus an additional statement that blocks deletion and disabling of production-aliased keys:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "KMSShortDel",
      "Effect": "Deny",
      "Action": "kms:ScheduleKeyDeletion",
      "Resource": "*",
      "Condition": {
        "NumericLessThan": {
          "kms:ScheduleKeyDeletionPendingWindowInDays": "30"
        }
      }
    },
    {
      "Sid": "KMSProductionProtection",
      "Effect": "Deny",
      "Action": [
        "kms:ScheduleKeyDeletion",
        "kms:DisableKey"
      ],
      "Resource": "*",
      "Condition": {
        "ForAnyValue:StringLike": {
          "kms:ResourceAliases": [
            "alias/prod-*",
            "alias/production-*"
          ]
        }
      }
    }
  ]
}
The KMSProductionProtection statement blocks both deletion scheduling and key disabling for anything with a prod- or production- alias prefix, regardless of the deletion window.
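Which keys an alias condition catches depends on the set operator in front of StringLike. The following is a rough local model, not AWS's evaluation engine: it uses Python's fnmatch as a stand-in for StringLike wildcard matching (fnmatch also gives [ ] special meaning, which IAM does not), and the function names are illustrative.

```python
from fnmatch import fnmatchcase

PROTECTED_PATTERNS = ["alias/prod-*", "alias/production-*"]


def _matches(alias):
    """True if a single alias matches any protected pattern."""
    return any(fnmatchcase(alias, pattern) for pattern in PROTECTED_PATTERNS)


def for_any_value(aliases):
    """Model of ForAnyValue: at least one alias matches.

    A key with no aliases never matches.
    """
    return any(_matches(alias) for alias in aliases)


def for_all_values(aliases):
    """Model of ForAllValues: every alias matches.

    An empty alias list also matches, because there is nothing to fail.
    """
    return all(_matches(alias) for alias in aliases)
```

In this model a key with the aliases alias/prod-db and alias/legacy is matched by for_any_value but not by for_all_values, and a key with no aliases at all is matched only by for_all_values. In a Deny statement, that difference decides whether a production key with one extra alias stays protected and whether unaliased keys get caught by accident.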
Rollout
Apply this in stages to catch legitimate uses of short windows before they break in production.
Week 1: non-production OUs. Development and staging accounts only. Watch for legitimate automation that uses --pending-window-in-days 7. Update those scripts now.
Week 2: production OUs. Communicate the change. Update runbooks. Any existing automation that schedules short-window deletions will start failing here.
Week 3: organization root. Organization-wide. From this point, no member account in the org can schedule a KMS deletion with less than 30 days. SCPs do not apply to the management account, so it remains outside this guardrail.
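Before each stage it helps to know who is already scheduling short windows. The sketch below filters CloudTrail records for ScheduleKeyDeletion calls under the threshold. It assumes the usual CloudTrail record shape, with the window in requestParameters.pendingWindowInDays and the caller in userIdentity.arn, and the function names are illustrative. The fetch helper reads CloudTrail event history for one region and imports boto3 lazily so the filter can be tried on saved records.

```python
import json


def short_window_calls(records, min_days=30):
    """Return (principal ARN, key ID, window) for calls below min_days."""
    found = []
    for record in records:
        if record.get("eventName") != "ScheduleKeyDeletion":
            continue
        params = record.get("requestParameters") or {}
        window = params.get("pendingWindowInDays")
        if window is not None and int(window) < min_days:
            arn = record.get("userIdentity", {}).get("arn")
            found.append((arn, params.get("keyId"), int(window)))
    return found


def fetch_records(region_name=None):
    """Pull ScheduleKeyDeletion records from CloudTrail event history."""
    import boto3  # lazy import: the filter above needs no SDK

    client = boto3.client("cloudtrail", region_name=region_name)
    paginator = client.get_paginator("lookup_events")
    pages = paginator.paginate(
        LookupAttributes=[
            {"AttributeKey": "EventName", "AttributeValue": "ScheduleKeyDeletion"}
        ]
    )
    # Each event carries the full CloudTrail record as a JSON string.
    return [
        json.loads(event["CloudTrailEvent"])
        for page in pages
        for event in page["Events"]
    ]
```

Running short_window_calls(fetch_records()) in each account and region gives a list of principals whose scripts need updating before the SCP reaches them.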
Multi-region enforcement
The deletion window condition applies in every region by default. Adding an aws:RequestedRegion condition narrows the denial to the listed regions only:
{
  "Condition": {
    "NumericLessThan": {
      "kms:ScheduleKeyDeletionPendingWindowInDays": "30"
    },
    "StringEquals": {
      "aws:RequestedRegion": [
        "us-east-1",
        "us-west-2",
        "eu-west-1"
      ]
    }
  }
}
Omitting StringEquals on region is usually correct — it means the denial applies everywhere, which is what you want.
Monitoring
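CloudWatch doesn’t have a native ScheduleKeyDeletion metric, but CloudTrail does record every kms:ScheduleKeyDeletion call. With a trail logging management events, those records reach EventBridge, where a rule can forward them to an SNS topic:
import json
import boto3

SNS_TOPIC_ARN = 'arn:aws:sns:us-east-1:123456789012:security-alerts'

def create_kms_deletion_rule():
    """Create an EventBridge rule that forwards ScheduleKeyDeletion calls to SNS."""
    events = boto3.client('events')
    # Match the CloudTrail record for every kms:ScheduleKeyDeletion API call.
    pattern = {
        'source': ['aws.kms'],
        'detail-type': ['AWS API Call via CloudTrail'],
        'detail': {
            'eventSource': ['kms.amazonaws.com'],
            'eventName': ['ScheduleKeyDeletion'],
        },
    }
    events.put_rule(
        Name='KMS-Key-Deletion-Scheduled',
        EventPattern=json.dumps(pattern),
        State='ENABLED',
        Description='Alert when KMS key deletion is scheduled',
    )
    # The SNS topic's access policy must allow events.amazonaws.com to publish.
    events.put_targets(
        Rule='KMS-Key-Deletion-Scheduled',
        Targets=[{'Id': 'security-alerts', 'Arn': SNS_TOPIC_ARN}],
    )
The SCP stops short-window deletions. The rule tells you when anyone schedules a deletion at all, even with the 30-day window. Both signals are useful.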
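For an alert that is readable at a glance, the CloudTrail record that EventBridge delivers can be condensed into one line. The sketch below assumes the event carries the CloudTrail record under detail, with an errorCode field present when the call failed (for example, when the SCP denied it). The function name and message format are illustrative.

```python
def summarize_deletion_event(event):
    """Build a one-line alert from an 'AWS API Call via CloudTrail' event."""
    detail = event.get("detail", {})
    params = detail.get("requestParameters") or {}
    who = detail.get("userIdentity", {}).get("arn", "unknown principal")
    key_id = params.get("keyId", "unknown key")
    window = params.get("pendingWindowInDays", "default")
    # CloudTrail sets errorCode on failed calls, including denied ones.
    outcome = "DENIED" if detail.get("errorCode") else "SCHEDULED"
    return (
        f"[{outcome}] KMS key deletion: key={key_id} window={window} days "
        f"by={who} region={detail.get('awsRegion', 'unknown')}"
    )
```

Distinguishing denied attempts from successful ones lets the same alert stream show both what the SCP blocked and what got through with the full window.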
Common objections
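“30 days is too long for dev/test keys.” Apply a less restrictive threshold to development OUs. SCP denies are cumulative, so a stricter policy inherited from the root cannot be relaxed further down; this option means attaching the 30-day statement per OU rather than at the root, with development OUs getting this statement instead:
{
  "Sid": "KMSDevShortDel",
  "Effect": "Deny",
  "Action": "kms:ScheduleKeyDeletion",
  "Resource": "*",
  "Condition": {
    "NumericLessThan": {
      "kms:ScheduleKeyDeletionPendingWindowInDays": "14"
    }
  }
}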
“Our automation uses 7-day windows.” Fix the automation. It’s one sed command per script:
sed -i 's/--pending-window-in-days 7/--pending-window-in-days 30/g' cleanup-scripts/*.sh
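“We need to delete a compromised key immediately.” Disable it immediately with kms:DisableKey, which stops it being used for new encrypt/decrypt operations right now. Then schedule deletion with the 30-day window. Disabling is the immediate security response; deletion is the cleanup. If the key carries a prod- or production- alias, the KMSProductionProtection statement blocks both calls and the break-glass process applies. Both commands take a key ID or key ARN rather than an alias name, so resolve the alias first:
KEY_ID=$(aws kms describe-key --key-id alias/compromised-key --query 'KeyMetadata.KeyId' --output text)
aws kms disable-key --key-id "$KEY_ID"
aws kms schedule-key-deletion --key-id "$KEY_ID" --pending-window-in-days 30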
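The same disable-then-schedule sequence can be scripted with boto3. This is a minimal sketch: the function name is illustrative, and it resolves whatever identifier it is given to a key ID first, since DescribeKey accepts alias names. The client is passed in so the sequence can be exercised against a stub.

```python
def disable_then_schedule(kms, key_ref, window_days=30):
    """Disable a key at once, then schedule deletion with the full window.

    kms is a boto3 KMS client (or any object with the same three methods).
    key_ref may be a key ID, key ARN, or alias name; describe_key resolves it.
    """
    key_id = kms.describe_key(KeyId=key_ref)["KeyMetadata"]["KeyId"]
    kms.disable_key(KeyId=key_id)
    response = kms.schedule_key_deletion(
        KeyId=key_id, PendingWindowInDays=window_days
    )
    return key_id, response.get("DeletionDate")
```

With a real client this would be called as disable_then_schedule(boto3.client("kms"), "alias/compromised-key").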
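“What about genuine break-glass scenarios?” Move the account temporarily out of the SCP-protected OU, perform the operation with approval, move it back. Log every step. That process is your audit trail for why an exception was granted. This only works while the SCP is attached at the OU level; once it is attached at the organization root, every OU inherits it, and an exception means temporarily detaching the policy instead.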
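The move-out, act, move-back sequence is easy to get half right, so it is worth making the return step unconditional. The sketch below wraps the Organizations MoveAccount call in a context manager; the function name, the OU IDs, and the log callback are all illustrative, and the client is passed in so the flow can be tested against a stub.

```python
from contextlib import contextmanager


@contextmanager
def temporarily_moved(org, account_id, protected_ou_id, exception_ou_id, log=print):
    """Move an account out of the protected OU and always move it back."""
    org.move_account(
        AccountId=account_id,
        SourceParentId=protected_ou_id,
        DestinationParentId=exception_ou_id,
    )
    log(f"moved {account_id} to {exception_ou_id}")
    try:
        yield
    finally:
        # Runs even if the approved operation raises.
        org.move_account(
            AccountId=account_id,
            SourceParentId=exception_ou_id,
            DestinationParentId=protected_ou_id,
        )
        log(f"returned {account_id} to {protected_ou_id}")
```

The approved operation goes inside the with block, and the log callback can write to whatever audit store the exception process uses.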
This SCP costs nothing to deploy and requires no ongoing maintenance. It prevents the class of incident where encrypted data becomes permanently inaccessible because a deletion window was too short. The only reason not to deploy it is if you have existing automation that schedules short-window deletions — and fixing that automation is the right call anyway.