Secrets Store CSI Driver for EKS, isn't there anything better?
I believe you have come here because you were looking for a better tool to get your secrets from AWS Secrets Manager into your EKS cluster. I can’t blame you. I also spent some time with the CSI Driver wondering if this is really the road I should take. Maybe, after reading this short blog post, you will find out that my use case was different from yours and this was a waste of time. Long story short, I ended up with External Secrets, and here is the story of why.
What’s wrong with the CSI Driver
There are a few issues which convinced me there must be a better way to get my secrets from the cloud. I am not saying that those are yours as well, but in my setup, those were rather significant.
Allocations
Right after deploying the required helm charts, I noticed that a daemonset named secret-provider-aws-secrets-store-csi-driver-provider was deployed. I can be a bit upset about the name, which could be half as long and say the same thing. Anyway, the allocations are as follows:
resources:
  limits:
    cpu: 50m
    memory: 100Mi
  requests:
    cpu: 50m
    memory: 100Mi
It ain’t that much. The problem is that this is reserved on every node in the cluster. There are a lot of services on each node already, and the amount of CPU and memory left for your application keeps shrinking.
But there is a second daemonset which gets deployed, named csi-secrets-store-secrets-store-csi-driver. Again, wonderful name if you ask me. The allocations are as follows:
resources:
  limits:
    cpu: 100m
    memory: 100Mi
  requests:
    cpu: 10m
    memory: 20Mi
For some reason this pod is not treated as being as important as the one from secret-provider-aws-secrets-store-csi-driver-provider. Its requests are lower than its limits, which gives it QoS class Burstable, so it is more exposed than a Guaranteed pod at times when the cluster has a hard time with load.
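To make the QoS difference between the two daemonsets concrete, here is a minimal Python sketch of the classification rule, simplified to a single container (Kubernetes classifies the pod across all of its containers). The function name qos_class and the dict layout are my own for illustration and not a Kubernetes API, and quantities are compared as plain strings.

```python
def qos_class(resources: dict) -> str:
    """Simplified QoS rule for a pod with one container."""
    requests = resources.get("requests", {})
    limits = resources.get("limits", {})
    if not requests and not limits:
        return "BestEffort"
    # Guaranteed needs CPU and memory limits, with requests equal to them.
    # A missing request defaults to the limit, as Kubernetes does.
    guaranteed = all(
        res in limits and requests.get(res, limits[res]) == limits[res]
        for res in ("cpu", "memory")
    )
    return "Guaranteed" if guaranteed else "Burstable"


# Values copied from the two daemonsets described in this post.
provider = {"limits": {"cpu": "50m", "memory": "100Mi"},
            "requests": {"cpu": "50m", "memory": "100Mi"}}
driver = {"limits": {"cpu": "100m", "memory": "100Mi"},
          "requests": {"cpu": "10m", "memory": "20Mi"}}

print(qos_class(provider))  # Guaranteed
print(qos_class(driver))    # Burstable
```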
I know that there are good reasons why the requests and limits are there. I do believe that somebody gave it some thought and figured this makes sense. But I just wanted something which takes the secrets from AWS Secrets Manager and creates exactly the same secrets in my Kubernetes cluster. Do you need that many resources for that? I don’t think so.
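To put a number on it, here is a rough Python sketch that adds up the requests of both daemonsets and multiplies them by the node count. The helper names and the 20-node cluster are assumptions for illustration, and the parsers only handle the units used in this post.

```python
def millicores(quantity: str) -> int:
    """Parse CPU quantities such as '50m' or '1' into millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def mebibytes(quantity: str) -> int:
    """Parse memory quantities given in Mi only, e.g. '100Mi'."""
    if not quantity.endswith("Mi"):
        raise ValueError(f"unsupported unit in {quantity!r}")
    return int(quantity[:-2])


# Requests of the two daemonsets, copied from the values in this post.
REQUESTS = [
    {"cpu": "50m", "memory": "100Mi"},  # AWS provider
    {"cpu": "10m", "memory": "20Mi"},   # CSI driver
]


def reserved(node_count: int) -> tuple[int, int]:
    """Return (millicores, MiB) requested across the whole cluster."""
    cpu = sum(millicores(r["cpu"]) for r in REQUESTS)
    mem = sum(mebibytes(r["memory"]) for r in REQUESTS)
    return cpu * node_count, mem * node_count


print(reserved(1))   # (60, 120)
print(reserved(20))  # (1200, 2400), hypothetical 20-node cluster
```

So on a cluster of that size, more than one full CPU core and over 2 GiB of memory would be requested just for mounting secrets.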
Debugging
I gave it a try. It seemed to me that this is the only tool AWS officially supports, so it must be good. The problem was that issues are not that easy to debug. I created a proof of concept setup without any trouble. Just follow the documentation and it will work. Then I tried to integrate the solution into an already existing application. The first error showed up in the pod describe output:
MountVolume.SetUp failed for volume "secrets-store" : rpc error: code = Unknown desc = failed to mount secrets store objects for pod ns/migrations-tczkj, err: error connecting to provider "aws": provider not found: provider "aws"
That’s odd. Almost the same setup worked in the proof of concept. I tried to run a describe command on the SecretProviderClass object. Were there any events to give me a hint? No! I tried googling the issue. No luck there either. I told myself that this must be something other than a missing provider and tried to deploy my proof of concept to a different namespace.
And that’s where I received a second error:
MountVolume.SetUp failed for volume "secrets-store-inline" : rpc error: code = Unknown desc = failed to mount secrets store objects for pod ns/secrets-testing-app-6f6744dd4-6tsjr, err: rpc error: code = Unknown desc = Failed to fetch secret from all regions: arn:aws:secretsmanager:eu-west-1:...:secret:testing-secret
The ARN matched my secret and the region was also correct. What’s wrong? I googled the issue again, without much success.
At that point, I gave up and installed the External Secrets operator. I deployed a small setup just to see if everything was running as expected. After deploying the first SecretStore I wasn’t able to deploy any ExternalSecret objects. So I tried describe on the SecretStore and there it was, a simple, constructive and well-written log line:
Warning ValidationFailed 7m54s secret-store WebIdentityErr: failed to retrieve credentials
caused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity
status code: 403, request id: 466e06d0-55de-4f44-b7e0-48573084dda8
I had messed up the trust policy (Trusted entities) of the IAM role. Instead of allowing multiple namespaces, I had left in only the namespace used for the proof of concept. See the documentation in the GitHub repository. This is the setup I had:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::...:oidc-provider/oidc.eks.eu-west-1.amazonaws.com/id/…"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringLike": {
          "oidc.eks.eu-west-1.amazonaws.com/id/…:sub": "system:serviceaccount:secrets-testing:secrets-accessor"
        }
      }
    }
  ]
}
In StringLike I changed the value to "system:serviceaccount:*:secrets-accessor", which allows any service account named secrets-accessor in any namespace to assume the role. Everything started to work as it should. Why I didn’t get any reasonable log line from the CSI Driver, I don’t know.
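To illustrate why the wildcard made the difference, here is a small Python sketch that approximates the StringLike comparison with fnmatchcase from the standard library. This is only an approximation: IAM supports the * and ? wildcards, while fnmatch additionally understands [seq] patterns, which does not matter for the values here. The helper name string_like is my own, and the sub value uses the external-secrets namespace from the ClusterSecretStore shown later in this post.

```python
from fnmatch import fnmatchcase


def string_like(pattern: str, value: str) -> bool:
    """Approximate IAM StringLike: '*' matches any run of characters, '?' exactly one."""
    return fnmatchcase(value, pattern)


POC_ONLY = "system:serviceaccount:secrets-testing:secrets-accessor"
ANY_NAMESPACE = "system:serviceaccount:*:secrets-accessor"

# The sub claim presented by the operator's service account.
sub = "system:serviceaccount:external-secrets:secrets-accessor"

print(string_like(POC_ONLY, sub))       # False
print(string_like(ANY_NAMESPACE, sub))  # True
```

Keep in mind that the wildcard trusts a service account with that name in every namespace, so it is worth listing the namespaces explicitly if not everyone on the cluster should be able to create one.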
What’s good about the CSI Driver
The good thing is that the CSI setup forces you to have a service account for each pod which has a permission to access the secret in the cloud. This makes the access much more granular and easier to manage. The External Secrets operator is using one service account in my setup, which might create security concerns.
The External Secrets operator is not technically providing access to the pods. It just creates Kubernetes secrets for you. Here is an example of the ClusterSecretStore:
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: secrets-accessor-secret-manager
spec:
  provider:
    aws:
      auth:
        jwt:
          serviceAccountRef:
            name: secrets-accessor
            namespace: external-secrets
      region: eu-west-1
      service: SecretsManager
The ExternalSecret resource only creates the mapping from AWS Secrets Manager. The referenced service account has to be mapped to an IAM role with the same permissions as in the CSI Driver deployment. An ExternalSecret can then be deployed without any reference to a service account:
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: secrets-backend
spec:
  dataFrom:
    - extract:
        key: name of the secret in Secrets Manager
        version: AWSCURRENT
  secretStoreRef:
    kind: ClusterSecretStore
    name: secrets-accessor-secret-manager
  target:
    creationPolicy: Owner
    deletionPolicy: Retain
    name: secrets-backend
Summary
Overall, I would suggest considering External Secrets over the CSI Driver setup. If you have security concerns, deploy the CSI Driver and be ready for a few debugging afternoons. I also mentioned the resources allocated by the CSI Driver pods. The External Secrets operator does not set resource requests or limits by default, but there is a value in the helm chart for that (.Values.resources).
Hopefully, this post will help somebody decide which tool to use for accessing AWS Secrets Manager. I am not saying my use case was the best, but it works for me. Anyway, like and subscribe.