This pull request adds support for fetching credentials from HashiCorp Vault directly from Quarkus when using the Agroal connection pool.
The initial discussion is here:
https://groups.google.com/d/msg/quarkus-dev/Izy6KyqNtrk/xyDSIBFfBQAJ
I believe the main points have been addressed.
I have developed a sample application that demonstrates its use:
https://github.com/vsevel/vaultapp
From an application perspective, it is as easy as:
- configuring the vault server access:
quarkus.vault.url=https://vault.vault.svc:8200
quarkus.vault.role=myapprole
- configuring a datasource that uses a password stored in vault:
quarkus.datasource.driver=org.postgresql.Driver
quarkus.datasource.url=jdbc:postgresql://postgres:5432/mydb
quarkus.datasource.max-lifetime=1H
quarkus.datasource.username=myuser
quarkus.datasource.vault.secret-path=foo
- or configuring a datasource to use dynamically generated credentials (username and password):
quarkus.datasource.driver=org.postgresql.Driver
quarkus.datasource.url=jdbc:postgresql://postgres:5432/mydb
quarkus.datasource.max-lifetime=1H
quarkus.datasource.vault.dbrole=mydbrole
Depending on the situation, vault has to be configured differently. Please refer to the vaultapp example application, or to the VaultTestExtension that supports the integration tests.
This PR honors edge cases such as:
- renewing leases/tokens when the ttl is reached
- recreating leases/tokens when the max-ttl is reached or a revocation has been processed
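The renew-versus-recreate decision described in the two points above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the `LeaseDecision` class, its `Action` enum and all parameter names are hypothetical, and the sketch follows the wording above literally (renew once the ttl is reached), whereas a real client would probably renew with some margin before expiry.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper, for illustration only; not part of this PR. */
final class LeaseDecision {

    enum Action { REUSE, RENEW, RECREATE }

    static Action decide(Instant createdAt, Instant lastRenewedAt,
                         Duration ttl, Duration maxTtl,
                         boolean revoked, Instant now) {
        // A revoked lease, or one that reached its max-ttl, cannot be renewed:
        // the client has to authenticate / fetch the secret again.
        if (revoked || !now.isBefore(createdAt.plus(maxTtl))) {
            return Action.RECREATE;
        }
        // The ttl is reached but max-ttl is not: ask vault to extend the lease.
        if (!now.isBefore(lastRenewedAt.plus(ttl))) {
            return Action.RENEW;
        }
        // Still within the ttl: keep using what we have.
        return Action.REUSE;
    }
}
```

Because agroal recycles connections at max-lifetime, a check of this kind can run each time a new connection asks for its password, which is why no separate timer is needed.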
This level of functionality has been made possible by https://issues.jboss.org/browse/AG-116, which was developed in agroal for this use case, plus the max-lifetime feature, which forces connections to be recycled on a regular basis (and so lets us go through vault's renewal logic without implementing some kind of timer mechanism in VaultPassword). To benefit from this feature, I had to expose it through a new property maxLifetime in DataSourceRuntimeConfig.
It means that you can use static passwords in vault and revoke the client token, or dynamic credentials and revoke the lease, and force client applications to go through the secret acquisition logic, without any downtime.
Each datasource will have its own VaultPassword (VaultPassword instances are not shared across multiple datasources, which is not an issue because they are lightweight objects).
We support using secrets on the default datasource, and also named datasources.
2 auth mechanisms are supported:
- kubernetes: see https://www.vaultproject.io/docs/auth/kubernetes.html
- userpass: for testing, or outside k8s. see https://www.vaultproject.io/docs/auth/userpass.html
The kubernetes auth is configured by default, and leverages the standard jwt token located at:
/var/run/secrets/kubernetes.io/serviceaccount/token
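For readers unfamiliar with the kubernetes auth flow, here is a minimal sketch of what a login request looks like: the service account jwt is read from the file above and posted to vault together with the role. The `KubernetesLogin` class and its method names are hypothetical and are not taken from VaultAuthService; the sketch also assumes the auth method is mounted at vault's default `kubernetes` path.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper, for illustration only; not part of this PR. */
final class KubernetesLogin {

    // Standard location of the service account token inside a pod.
    static final Path DEFAULT_JWT_PATH =
            Path.of("/var/run/secrets/kubernetes.io/serviceaccount/token");

    /** Reads the jwt from disk, trimming the trailing newline if any. */
    static String readJwt(Path jwtPath) {
        try {
            return Files.readString(jwtPath).trim();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Login endpoint, assuming the default "kubernetes" mount path. */
    static String loginUrl(String vaultUrl) {
        return vaultUrl + "/v1/auth/kubernetes/login";
    }

    /** JSON body of the login request: the jwt plus the vault role. */
    static String loginBody(String jwt, String role) {
        return "{\"jwt\":\"" + escape(jwt) + "\",\"role\":\"" + escape(role) + "\"}";
    }

    // Minimal escaping so that quotes and backslashes cannot break the JSON.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

With the configuration shown earlier, the body would be built from `quarkus.vault.role=myapprole` and posted to `https://vault.vault.svc:8200/v1/auth/kubernetes/login`; the client token returned by vault is then used for subsequent calls.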
Even if k8s is the primary target, it also works as a standalone container, or even as a good old java program running outside docker.
Other auth mechanisms (see https://www.vaultproject.io/docs/auth/index.html) can be easily added in VaultAuthService.
see VaultAuthService
TLS is supported, and active by default. A tls-skip-verify mode has been added as well. It is also possible to deactivate TLS altogether (dev mode).
see VaultHttpClientFactory
If running in k8s, we can automatically leverage the cacert bundle located here:
/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
For all properties there are sensible defaults, to ease deployment in k8s in particular.
see VaultServerRuntimeConfig
For static passwords, I support the kv engine versions 1 and 2 (with versioning).
see https://www.vaultproject.io/docs/secrets/kv/index.html
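The practical difference between the two kv versions, from a client's point of view, is mostly in the read path: version 2 inserts a `data` segment after the mount point (and nests the payload one level deeper in the response, under `data.data`). A minimal sketch, assuming a hypothetical `KvPaths` helper that is not part of this PR, and a mount point such as `secret` chosen purely for illustration:

```java
/** Hypothetical helper, for illustration only; not part of this PR. */
final class KvPaths {

    /** Builds the vault HTTP API path used to read a secret from a kv mount. */
    static String readPath(int kvVersion, String mount, String secretPath) {
        switch (kvVersion) {
            case 1:
                // kv v1: the secret is read directly under the mount.
                return "/v1/" + mount + "/" + secretPath;
            case 2:
                // kv v2: reads go through the "data" sub-path of the mount.
                return "/v1/" + mount + "/data/" + secretPath;
            default:
                throw new IllegalArgumentException("unsupported kv version: " + kvVersion);
        }
    }
}
```

For example, with `quarkus.datasource.vault.secret-path=foo` and a mount named `secret`, this gives `/v1/secret/foo` for version 1 and `/v1/secret/data/foo` for version 2.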
There is a very simple cache for fetched secrets, with a cache-period that can be set (default=10mins). The cache is used in the usual situation where we are within the cache period, but also if we get an http exception or a forbidden exception when we attempt to contact vault. In that case the application is allowed to use the last known value.
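The caching behaviour described above (serve from cache within the cache period, fall back to the last known value when vault cannot be reached) can be sketched as follows. The `CachedSecret` class is hypothetical and is not the implementation in this PR; in particular it treats any `RuntimeException` from the fetcher as a failure, whereas the PR only falls back on http and forbidden exceptions.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

/** Hypothetical cache, for illustration only; not part of this PR. */
final class CachedSecret {

    private final Supplier<String> fetcher; // assumed to call vault
    private final Duration cachePeriod;
    private String lastValue;
    private Instant fetchedAt;

    CachedSecret(Supplier<String> fetcher, Duration cachePeriod) {
        this.fetcher = fetcher;
        this.cachePeriod = cachePeriod;
    }

    synchronized String get(Instant now) {
        // Usual case: we are still within the cache period.
        if (lastValue != null && now.isBefore(fetchedAt.plus(cachePeriod))) {
            return lastValue;
        }
        try {
            lastValue = fetcher.get();
            fetchedAt = now;
            return lastValue;
        } catch (RuntimeException e) {
            // Vault unreachable or access denied: use the last known value.
            if (lastValue != null) {
                return lastValue;
            }
            throw e; // nothing cached yet, so the failure is propagated
        }
    }
}
```

Passing the current time in as a parameter is only a convenience to keep the sketch easy to exercise; a real implementation would more likely read a clock internally.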
The http client lets you set the read timeout (default is 500ms) and the connect timeout (default is 5000ms).
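To illustrate the two timeouts, here is a sketch using the JDK's own `java.net.http` client (Java 11+). This is a stand-in chosen to keep the example dependency-free and is not necessarily what VaultHttpClientFactory uses; the `VaultHttpTimeouts` class is hypothetical. Note also that the JDK's per-request timeout bounds the wait for the response as a whole, which is only an approximation of a socket read timeout.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

/** Hypothetical helper, for illustration only; not part of this PR. */
final class VaultHttpTimeouts {

    /** Client-wide limit on how long establishing a connection may take. */
    static HttpClient client(Duration connectTimeout) {
        return HttpClient.newBuilder()
                .connectTimeout(connectTimeout)
                .build();
    }

    /** Per-request limit on how long we wait for vault's response. */
    static HttpRequest get(String url, Duration readTimeout) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(readTimeout)
                .GET()
                .build();
    }
}
```

With the defaults quoted above, this would be `client(Duration.ofMillis(5000))` and `get(url, Duration.ofMillis(500))`.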
Most of the logging is done at debug level, and there is a log-confidentiality-level property that can be set to print out confidential information in dev mode.
I have created integration tests that leverage testcontainers to start and configure a complete vault+postgres system.
see VaultTestExtension and Vault*ITCase classes.
I just saw that the windows build was failing with an error on getting the testcontainers lib to work:
WARN: Failure when attempting to lookup auth config (dockerImageName: quay.io/testcontainers/ryuk:0.2.3, configFile: C:\Users\VssAdministrator\.docker\config.json. Falling back to docker-java default behaviour. Exception message: C:\Users\VssAdministrator\.docker\config.json (The system cannot find the path specified)
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.254 s <<< FAILURE! - in io.quarkus.agroal.test.VaultITCase
[ERROR] io.quarkus.agroal.test.VaultITCase Time elapsed: 4.253 s <<< ERROR!
com.github.dockerjava.api.exception.DockerClientException: Could not pull image: failed to register layer: re-exec error: exit status 1: output: ProcessBaseLayer \\?\C:\ProgramData\docker\windowsfilter\6aaa4d9cf3d93098b15697d0b0b820056b63c44cf6ae1104f9048086bc6985dd: The system cannot find the path specified.
Not sure about that.
The integration test was by far the most challenging to write, and that is why I spent most of my efforts on it. I will complete the testing side with unit tests very shortly.
I have done quite a bit of manual testing as well using the example application vaultapp.
Future improvements to be discussed:
- use the resteasy client extension to call vault (instead of configuring our own http client)
- make vault an extension of its own, and get agroal to depend on it, which would allow one vault instance to be shared inside quarkus, and would open up some other use cases (e.g. certificate management) by leveraging the other secret engines.
The only minor glitch I encountered was that for some reason I could not get the org.testcontainers.postgresql artifact to use version ${test-containers.version}, and instead had to hardcode it in the pom. That is probably easy to figure out.
I am hoping you can see the value of this PR, given that:
- vault is a standard in a microservice world
- I have tried to respect quarkus's spirit, specifically on the ease of use side
- plus I have spent quite a few hours to provide production-quality code ;-)
Please reach out if you think this PR has the potential to make its way into the product.
Regards,
Vincent
Fixes #2764
release/noteworthy-feature