PBKDF2 Upgrade and CRYPT Format Compatibility
This document describes the proposed upgrade path for PBKDF2 usage in
light-4j, with special focus on encrypted configuration values that use the
CRYPT prefix.
The immediate operational issue is compatibility. The new light-encryptor
emits only the four-part AES-GCM format, while existing deployed configuration
may still contain the legacy three-part AES-CBC format. The light-4j
decryptor should read both formats so services can upgrade safely without
requiring every encrypted configuration value to be regenerated at the same
time.
Background
There are two separate PBKDF2 use cases in light-4j:
- AESSaltDecryptor derives an AES key from a configured master password for decrypting encrypted configuration values.
- HashUtil generates and validates PBKDF2 password or API-key hashes.
These two paths should not be upgraded in exactly the same way. OWASP password
storage guidance applies directly to password and API-key hashes, where PBKDF2
is the password hashing function. The CRYPT path uses PBKDF2 as a key
derivation function before AES encryption. It has its own compatibility
constraints because the encrypted value format historically did not include an
algorithm version or iteration count.
Current OWASP guidance recommends PBKDF2-HMAC-SHA256 with 600,000 iterations
for password storage. See the OWASP Password Storage Cheat Sheet:
https://cheatsheetseries.owasp.org/cheatsheets/Password_Storage_Cheat_Sheet.html
Current CRYPT Formats
Legacy Three-Part Format
The legacy format is:
CRYPT:<hex-salt>:<hex-ciphertext>
The legacy implementation uses:
- PBKDF2WithHmacSHA256
- 65,536 iterations
- 256-bit AES key
- AES/CBC/PKCS5Padding
- a static zero IV
This format is still present in older deployed service configuration and older test fixtures. It should be treated as deprecated but readable.
The main security weakness is the static zero IV in CBC mode. With a fixed IV, the same plaintext encrypted with the same derived key produces the same first ciphertext block. That can reveal equality or common prefixes across encrypted configuration values. For this reason, legacy CBC support must be read-only and the runtime should warn when it decrypts a legacy value.
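The prefix leak described above can be reproduced in a few lines. This is a minimal sketch, not light-4j code: the class name ZeroIvLeakDemo and its helpers are hypothetical, and the key is passed in directly so that both values share one derived key, which is the condition the paragraph assumes.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;

// Hypothetical demo class; it only illustrates the legacy cipher settings.
final class ZeroIvLeakDemo {

    // Encrypts with AES/CBC/PKCS5Padding and an all-zero 16-byte IV.
    static byte[] encryptWithZeroIv(byte[] key, String plaintext) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new IvParameterSpec(new byte[16]));
            return cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // True when both ciphertexts start with the same 16-byte AES block.
    static boolean sameFirstBlock(byte[] a, byte[] b) {
        return Arrays.equals(Arrays.copyOf(a, 16), Arrays.copyOf(b, 16));
    }
}
```

With one key, two plaintexts that share their first 16 bytes (for example two JDBC URLs beginning with jdbc:postgresql:) produce the same first ciphertext block, even though the remaining blocks differ.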
Current Four-Part Format
The current light-encryptor output format is:
CRYPT:<hex-salt>:<hex-iv>:<hex-ciphertext-and-tag>
The current implementation uses:
- PBKDF2WithHmacSHA256
- 65,536 iterations
- 256-bit AES key
- AES/GCM/NoPadding
- random 16-byte salt
- random 12-byte GCM IV
- 128-bit GCM authentication tag
This is the only format that light-encryptor should emit going forward.
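To make the field layout concrete, the following sketch assembles a four-part value from the parameters listed above. It is an illustration only and not the light-encryptor source: the class name CryptTokenSketch and its methods are hypothetical, and java.util.HexFormat assumes Java 17 or later.

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.HexFormat;

// Hypothetical sketch of the four-part CRYPT layout.
final class CryptTokenSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    // PBKDF2WithHmacSHA256, 65,536 iterations, 256-bit AES key.
    static SecretKeySpec deriveKey(char[] masterPassword, byte[] salt)
            throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(masterPassword, salt, 65_536, 256);
        byte[] key = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                .generateSecret(spec).getEncoded();
        return new SecretKeySpec(key, "AES");
    }

    // Returns CRYPT:<hex-salt>:<hex-iv>:<hex-ciphertext-and-tag>.
    static String encrypt(char[] masterPassword, String plaintext) {
        try {
            byte[] salt = new byte[16]; // random 16-byte salt
            byte[] iv = new byte[12];   // random 12-byte GCM IV
            RANDOM.nextBytes(salt);
            RANDOM.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, deriveKey(masterPassword, salt),
                    new GCMParameterSpec(128, iv)); // 128-bit tag
            // doFinal appends the tag to the ciphertext.
            byte[] out = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            HexFormat hex = HexFormat.of();
            return "CRYPT:" + hex.formatHex(salt) + ":" + hex.formatHex(iv)
                    + ":" + hex.formatHex(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("CRYPT encryption failed", e);
        }
    }
}
```

Under these assumptions the salt field is 32 hex characters, the IV field is 24, and the last field is the plaintext length plus 16 tag bytes, hex encoded.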
Goals
- Allow light-4j runtime decryptors to read both legacy three-part CRYPT values and current four-part CRYPT values.
- Keep light-encryptor simple by generating only the current four-part AES-GCM format.
- Avoid breaking existing checked-in or deployed encrypted configuration during framework upgrades.
- Make the future PBKDF2 cost upgrade explicit instead of overloading an unversioned format.
- Preserve clear error messages that identify invalid CRYPT formatting.
Non-Goals
- Do not make light-encryptor generate legacy three-part values.
- Do not silently re-encrypt configuration files at runtime.
- Do not change the meaning of existing four-part CRYPT:salt:iv:hash values without adding a versioned format.
- Do not apply OWASP password-storage iteration guidance blindly to every PBKDF2 use without measuring startup and request-time cost.
Design
1. Add Format Dispatch in AESSaltDecryptor
AESSaltDecryptor.decrypt(String input) should split the input on ":" and
dispatch on the number of parts, counting the CRYPT prefix as the first part:
parts.length == 3 -> decryptLegacyCbc(parts)
parts.length == 4 -> decryptGcm(parts)
otherwise -> invalid CRYPT format
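Use split(":", -1) so that trailing empty fields are preserved. With the
default limit, String.split drops trailing empty strings, so a malformed value
such as CRYPT:salt:iv: would be counted as three parts and routed to the legacy
path instead of being rejected as malformed.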
Suggested method shape:
public String decrypt(String input) {
    if (input == null || !input.startsWith(CRYPT_PREFIX + ":")) {
        throw new RuntimeException("Unable to decrypt, input string does not start with 'CRYPT:'.");
    }
    // A limit of -1 keeps trailing empty fields so they can be rejected later.
    String[] parts = input.split(":", -1);
    // parts[0] is the CRYPT prefix, so the count includes it.
    return switch (parts.length) {
        case 3 -> decryptLegacyCbc(parts);
        case 4 -> decryptGcm(parts);
        default -> throw invalidCryptFormat();
    };
}
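The effect of the -1 limit can be checked in isolation. The helper class below is hypothetical and exists only to compare the two String.split overloads on a value with an empty trailing field.

```java
// Hypothetical helper comparing String.split with and without a negative limit.
final class SplitLimitDemo {

    // Default limit: trailing empty strings are removed from the result.
    static int partsWithDefaultLimit(String value) {
        return value.split(":").length;
    }

    // Negative limit: trailing empty strings are kept.
    static int partsWithNegativeLimit(String value) {
        return value.split(":", -1).length;
    }
}
```

For the input CRYPT:aabb:ccdd: the default overload reports three parts, which would look like a legacy value, while the -1 overload reports four parts with an empty last field.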
The error text should mention both accepted formats:
CRYPT:salt:hash
CRYPT:salt:iv:hash
The implementation should validate that required fields are non-empty before hex decoding. A malformed value should fail as a format error, while a bad master password, corrupted ciphertext, or failed GCM authentication tag should fail as a cryptographic error. The public exception message should remain generic enough that it does not disclose secret material, but the internal log message should let an operator distinguish invalid structure from failed decryption.
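One possible shape for the field validation is sketched below. The names CryptFieldParser and CryptFormatException are assumptions made for this example and are not existing light-4j types; java.util.HexFormat assumes Java 17 or later. The error messages name the field but never echo its content.

```java
import java.util.HexFormat;

// Hypothetical parser that turns structural problems into a distinct format error.
final class CryptFieldParser {

    // Hypothetical exception type for malformed CRYPT values.
    static final class CryptFormatException extends RuntimeException {
        CryptFormatException(String message) {
            super(message);
        }
    }

    // Checks that the field is non-empty, then hex decodes it.
    static byte[] decodeField(String fieldName, String hex) {
        if (hex == null || hex.isEmpty()) {
            throw new CryptFormatException("CRYPT field '" + fieldName + "' is empty");
        }
        try {
            return HexFormat.of().parseHex(hex);
        } catch (IllegalArgumentException e) {
            // Raised for odd length or non-hex characters; the value is not logged.
            throw new CryptFormatException("CRYPT field '" + fieldName + "' is not valid hex");
        }
    }
}
```

Because the format error is raised before any key derivation or cipher call, anything that fails later can be reported as a cryptographic error.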
2. Keep Legacy CBC Decryption Read-Only
The legacy path should preserve the previous behavior exactly:
- parse salt from parts[1]
- parse ciphertext from parts[2]
- derive the AES key with PBKDF2WithHmacSHA256, 65,536 iterations, 256-bit key
- decrypt with AES/CBC/PKCS5Padding
- use the legacy static zero IV
This path exists only to keep existing configuration readable. It should not be
used by light-encryptor for new values.
When this path is used, the decryptor should log a throttled warning, ideally once per JVM or once per unique legacy value:
Legacy three-part CRYPT value decrypted. Regenerate this secret with the current
light-encryptor to migrate to AES-GCM.
The warning must not include the encrypted value, plaintext secret, derived key, master password, salt, or ciphertext.
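A once-per-JVM throttle can be as small as an atomic flag. In the sketch below the class name LegacyCryptWarning is hypothetical, and the log sink is passed in as a Consumer so the example does not assume a particular logging library; in a real decryptor the instance would typically be held in a static field.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical once-only warning for the legacy three-part path.
final class LegacyCryptWarning {
    static final String MESSAGE =
            "Legacy three-part CRYPT value decrypted. Regenerate this secret with the "
            + "current light-encryptor to migrate to AES-GCM.";

    private final AtomicBoolean warned = new AtomicBoolean(false);
    private final Consumer<String> sink;

    LegacyCryptWarning(Consumer<String> sink) {
        this.sink = sink;
    }

    // Emits the fixed message on the first call only; later calls are silent.
    void onLegacyDecrypt() {
        if (warned.compareAndSet(false, true)) {
            sink.accept(MESSAGE);
        }
    }
}
```

The message is a constant, so no part of the encrypted value, salt, or plaintext can reach the log through this path. Throttling once per unique legacy value would need a set of already-reported identifiers instead of a single flag.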
3. Keep Four-Part GCM as the Default Format
The four-part path should preserve the current behavior:
- parse salt from parts[1]
- parse IV from parts[2]
- parse ciphertext plus tag from parts[3]
- derive the AES key with PBKDF2WithHmacSHA256, 65,536 iterations, 256-bit key
- decrypt with AES/GCM/NoPadding
- use a 128-bit authentication tag
light-encryptor should keep emitting only this format.
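The four-part steps listed above can be sketched as follows. This is an illustration under stated assumptions rather than the actual decryptGcm implementation: the class GcmPathSketch is hypothetical, the fixture method exists only so a test can build a reproducible value, and java.util.HexFormat assumes Java 17 or later.

```java
import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.HexFormat;

// Hypothetical sketch of the four-part AES-GCM path.
final class GcmPathSketch {
    private static final HexFormat HEX = HexFormat.of();

    // Derives the key and returns a freshly initialized Cipher for one operation.
    private static Cipher gcm(int mode, char[] password, byte[] salt, byte[] iv)
            throws GeneralSecurityException {
        byte[] key = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                .generateSecret(new PBEKeySpec(password, salt, 65_536, 256)).getEncoded();
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(mode, new SecretKeySpec(key, "AES"), new GCMParameterSpec(128, iv));
        return cipher;
    }

    // parts = { "CRYPT", hex salt, hex IV, hex ciphertext plus tag }
    static String decryptGcm(String[] parts, char[] password) {
        try {
            byte[] salt = HEX.parseHex(parts[1]);
            byte[] iv = HEX.parseHex(parts[2]);
            byte[] data = HEX.parseHex(parts[3]);
            byte[] plain = gcm(Cipher.DECRYPT_MODE, password, salt, iv).doFinal(data);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (AEADBadTagException e) {
            // Wrong password or modified ciphertext: never return partial plaintext.
            throw new SecurityException("CRYPT value failed authentication", e);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("CRYPT decryption failed", e);
        }
    }

    // Test-only fixture builder: caller-supplied salt and IV make the output reproducible.
    static String fixture(char[] password, byte[] salt, byte[] iv, String plaintext) {
        try {
            byte[] out = gcm(Cipher.ENCRYPT_MODE, password, salt, iv)
                    .doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            return "CRYPT:" + HEX.formatHex(salt) + ":" + HEX.formatHex(iv)
                    + ":" + HEX.formatHex(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Changing a single hex digit of the last field makes doFinal raise AEADBadTagException, which the sketch surfaces as a failure instead of returning data.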
4. Cache Derived Keys by KDF Parameters
The existing cache key is based on salt. That is enough while all supported formats use the same PBKDF2 algorithm, iteration count, and key size, but it is not a good foundation for the next upgrade.
Use a cache key that includes the KDF parameters:
<kdf-algorithm>:<iterations>:<key-size>:<salt-hex>
This prevents a future format from accidentally reusing a key derived with old parameters.
The cache key does not need to include the master password as long as the master
password is fixed for the JVM lifetime, which matches the current
AutoAESSaltDecryptor environment-variable model. If a future implementation
allows hot reload of the master password without a JVM restart, the cache must
be cleared on password change or the password identity must become part of the
cache key.
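A cache keyed by KDF parameters could look like the sketch below. The class DerivedKeyCache and its method names are hypothetical, the master password is assumed to be fixed for the lifetime of the instance, and java.util.HexFormat assumes Java 17 or later. Only SecretKeySpec values are stored; no Cipher instance is cached.

```java
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;
import java.util.HexFormat;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical derived-key cache keyed by algorithm, iterations, key size, and salt.
final class DerivedKeyCache {
    private final Map<String, SecretKeySpec> cache = new ConcurrentHashMap<>();
    private final char[] masterPassword;

    DerivedKeyCache(char[] masterPassword) {
        this.masterPassword = masterPassword.clone();
    }

    // <kdf-algorithm>:<iterations>:<key-size>:<salt-hex>
    static String cacheKey(String algorithm, int iterations, int keyBits, byte[] salt) {
        return algorithm + ":" + iterations + ":" + keyBits + ":"
                + HexFormat.of().formatHex(salt);
    }

    // Derives the key on first use and reuses it for identical parameters.
    SecretKeySpec keyFor(String algorithm, int iterations, int keyBits, byte[] salt) {
        return cache.computeIfAbsent(cacheKey(algorithm, iterations, keyBits, salt),
                k -> derive(algorithm, iterations, keyBits, salt));
    }

    private SecretKeySpec derive(String algorithm, int iterations, int keyBits, byte[] salt) {
        try {
            PBEKeySpec spec = new PBEKeySpec(masterPassword, salt, iterations, keyBits);
            byte[] key = SecretKeyFactory.getInstance(algorithm)
                    .generateSecret(spec).getEncoded();
            return new SecretKeySpec(key, "AES");
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 key derivation failed", e);
        }
    }

    // Must be called if the master password ever changes without a JVM restart.
    void clear() {
        cache.clear();
    }
}
```

With this key, the same salt used with two different iteration counts occupies two cache entries, so a future format cannot pick up a key derived with the old cost.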
5. Avoid Shared Cipher Instances
Cipher instances are mutable and should not be shared across concurrent
requests. Each decrypt operation should create and initialize a local Cipher.
The cache should store only derived SecretKeySpec values.
6. Keep Error Handling Actionable
The decryptor should separate three categories internally:
- format errors, such as missing fields, empty fields, invalid hex, or an unsupported number of parts
- key derivation errors, such as an unavailable PBKDF2 algorithm
- cryptographic failures, such as an incorrect master password, corrupted ciphertext, or AES-GCM tag failure
All categories should avoid logging sensitive values. GCM tag failures should not be downgraded or ignored; they are the authenticated-encryption signal that the value cannot be trusted.
PBKDF2 Iteration Upgrade
Password and API-Key Hashes
For HashUtil, new hashes should move to a versioned format that records the
algorithm and iteration count. For example:
pbkdf2-sha256:600000:<hex-salt>:<hex-hash>
Validation should support both:
<iterations>:<hex-salt>:<hex-hash>
pbkdf2-sha256:<iterations>:<hex-salt>:<hex-hash>
The legacy three-part password-hash format should continue to validate with its
original algorithm, PBKDF2WithHmacSHA1, because the stored value does not
record the algorithm. New hashes should use PBKDF2WithHmacSHA256 and 600,000
iterations.
Hash comparison should use a constant-time byte-array comparison such as
MessageDigest.isEqual(...). This avoids early-exit comparison behavior when
validating passwords or API keys.
For API keys, request-time cost must be measured before rollout because
ApiKeyHandler can call HashUtil.validatePassword(...) during request
authentication when hashed API keys are enabled.
HashUtil should also expose a helper such as needsRehash(String storedHash)
so callers that can persist credentials can upgrade legacy hashes after a
successful validation.
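The points in this subsection can be combined into one small sketch. It is not the HashUtil implementation: the class VersionedHashSketch and its method names are hypothetical, the 256-bit output size for new hashes is an assumption, and the legacy hash length is taken from the stored value because the source does not state it. java.util.HexFormat assumes Java 17 or later.

```java
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.HexFormat;

// Hypothetical sketch of versioned PBKDF2 hashes with legacy validation.
final class VersionedHashSketch {
    static final String PREFIX = "pbkdf2-sha256";
    static final int TARGET_ITERATIONS = 600_000;
    private static final HexFormat HEX = HexFormat.of();

    static byte[] pbkdf2(String algorithm, char[] password, byte[] salt, int iterations, int bits) {
        try {
            return SecretKeyFactory.getInstance(algorithm)
                    .generateSecret(new PBEKeySpec(password, salt, iterations, bits)).getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 derivation failed", e);
        }
    }

    // New hashes: pbkdf2-sha256:600000:<hex-salt>:<hex-hash>
    static String hash(char[] password) {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        byte[] hash = pbkdf2("PBKDF2WithHmacSHA256", password, salt, TARGET_ITERATIONS, 256);
        return PREFIX + ":" + TARGET_ITERATIONS + ":" + HEX.formatHex(salt) + ":" + HEX.formatHex(hash);
    }

    // Accepts the versioned four-part format and the legacy three-part SHA-1 format.
    static boolean validate(char[] password, String stored) {
        String[] p = stored.split(":", -1);
        boolean versioned = p.length == 4 && p[0].equals(PREFIX);
        if (!versioned && p.length != 3) {
            return false;
        }
        int offset = versioned ? 1 : 0;
        String algorithm = versioned ? "PBKDF2WithHmacSHA256" : "PBKDF2WithHmacSHA1";
        try {
            int iterations = Integer.parseInt(p[offset]);
            byte[] salt = HEX.parseHex(p[offset + 1]);
            byte[] expected = HEX.parseHex(p[offset + 2]);
            byte[] actual = pbkdf2(algorithm, password, salt, iterations, expected.length * 8);
            return MessageDigest.isEqual(expected, actual); // constant-time comparison
        } catch (IllegalArgumentException e) {
            return false; // bad number, bad hex, or invalid KDF parameters
        }
    }

    // True for legacy hashes and for versioned hashes below the target cost.
    static boolean needsRehash(String stored) {
        String[] p = stored.split(":", -1);
        if (p.length != 4 || !p[0].equals(PREFIX)) {
            return true;
        }
        try {
            return Integer.parseInt(p[1]) < TARGET_ITERATIONS;
        } catch (NumberFormatException e) {
            return true;
        }
    }
}
```

A caller that can persist credentials would run validate first and, only on success, check needsRehash and store the result of hash for the same password.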
CRYPT Values
The current four-part CRYPT format does not contain an iteration count. If the
decryptor changes four-part values from 65,536 iterations to 600,000 iterations
without a version marker, existing four-part values will become unreadable.
Therefore, the compatibility fix should keep both three-part CBC and four-part GCM values on the existing 65,536-iteration KDF. A later cost increase for encrypted configuration should introduce a versioned format, for example:
CRYPT2:<kdf-algorithm>:<iterations>:<hex-salt>:<cipher-algorithm>:<hex-iv>:<hex-ciphertext-and-tag>
That future format can use 600,000 iterations, or another measured value, while
the existing CRYPT formats remain readable.
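To show how a self-describing value removes the ambiguity, here is a sketch of a parser for the example CRYPT2 layout. Both the layout and the names Crypt2Sketch and Crypt2Value are illustrative assumptions; the final format has not been designed. The nested record and java.util.HexFormat assume Java 17 or later.

```java
import java.util.HexFormat;

// Hypothetical parser for the example CRYPT2 layout; the real format is still open.
final class Crypt2Sketch {

    // All parameters needed to derive the key and decrypt travel with the value.
    record Crypt2Value(String kdfAlgorithm, int iterations, byte[] salt,
                       String cipherAlgorithm, byte[] iv, byte[] ciphertextAndTag) {
    }

    static Crypt2Value parse(String input) {
        String[] p = input.split(":", -1);
        if (p.length != 7 || !p[0].equals("CRYPT2")) {
            throw new IllegalArgumentException("Invalid CRYPT2 format");
        }
        int iterations = Integer.parseInt(p[2]); // NumberFormatException is an IllegalArgumentException
        if (iterations <= 0) {
            throw new IllegalArgumentException("CRYPT2 iteration count must be positive");
        }
        HexFormat hex = HexFormat.of();
        return new Crypt2Value(p[1], iterations, hex.parseHex(p[3]),
                p[4], hex.parseHex(p[5]), hex.parseHex(p[6]));
    }
}
```

A production parser would also restrict the KDF and cipher names to an allow-list so that a configuration value cannot select a weak algorithm or an excessive iteration count.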
Because configuration decryption usually happens during startup and unique salts can be cached after the first derivation, a higher iteration count is more practical here than it is for per-request API-key validation. A reasonable target for the future versioned format is a measured startup cost, for example 100 ms to 200 ms per unique salt on supported deployment hardware. The exact iteration count should be selected from benchmark data rather than copied directly from password-storage guidance.
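A rough way to turn a latency target into an iteration count is to time a probe derivation and scale linearly, since PBKDF2 cost grows in proportion to the iteration count. The sketch below is a starting point only: the class KdfCostEstimate is hypothetical, it takes the best of a few runs to reduce JIT and scheduling noise, and a real benchmark should run on each supported deployment target.

```java
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import java.security.GeneralSecurityException;

// Hypothetical estimator that scales a measured probe to a target latency.
final class KdfCostEstimate {

    // Best observed time in nanoseconds for one derivation at the given cost.
    static long nanosPerDerivation(int iterations, int runs) {
        try {
            SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
            char[] password = "benchmark-only".toCharArray();
            byte[] salt = new byte[16];
            long best = Long.MAX_VALUE;
            for (int i = 0; i < runs; i++) {
                long start = System.nanoTime();
                factory.generateSecret(new PBEKeySpec(password, salt, iterations, 256));
                best = Math.min(best, System.nanoTime() - start);
            }
            return best;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Linear extrapolation from the probe to the target time per unique salt.
    static long iterationsForTarget(int probeIterations, long targetMillis) {
        long nanos = Math.max(1, nanosPerDerivation(probeIterations, 5));
        return probeIterations * targetMillis * 1_000_000L / nanos;
    }
}
```

For example, iterationsForTarget(65_536, 150) estimates how many iterations fit in 150 ms on the machine where it runs; the result varies by hardware and JVM, so it should inform the choice rather than decide it.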
Rollout Plan
- Update light-4j AESSaltDecryptor to read both three-part CBC and four-part GCM formats.
- Add regression tests for both formats with known ciphertext fixtures.
- Release the compatible light-4j decryptor.
- Keep light-encryptor generating only four-part GCM values.
- Log a throttled warning whenever the legacy three-part CBC path is used.
- Gradually regenerate old three-part values with the new light-encryptor when configuration files are touched.
- Add a scanner or CI check that can report remaining three-part CRYPT values, but do not fail builds until downstream repositories have migrated.
- Add a command-line migration utility that can rewrite legacy values after decrypting them with the configured master password.
- Design and implement a separate versioned format before increasing the KDF iteration count for CRYPT values.
Test Plan
Add focused tests in the decryptor module:
- decrypts a known legacy CRYPT:salt:hash value to the expected plaintext
- decrypts a known current CRYPT:salt:iv:hash value to the expected plaintext
- rejects malformed values with fewer than three parts
- rejects malformed values with more than four parts
- rejects empty salt, IV, or ciphertext fields
- verifies AutoAESSaltDecryptor and ManualAESSaltDecryptor inherit the same compatibility behavior
Add config-loading coverage:
- one YAML fixture with a legacy three-part value
- one YAML fixture with a four-part value
- one JSON-style config fixture if the existing config tests cover encrypted values in JSON strings
Add migration checks:
rg -n "CRYPT:[^:[:space:]]+:[^:[:space:]]+(\\s|$)" .
This reports likely legacy three-part values. Each match should be reviewed before replacement because a regular-expression search is only a locator, not a parser.
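The same check can run inside a Java-based CI step. The sketch below is a hypothetical helper, not an existing light-4j tool; it applies an equivalent pattern line by line and returns 1-based line numbers for review.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical scanner that reports lines containing a likely three-part CRYPT value.
final class LegacyCryptScanner {
    // CRYPT:<field>:<field> followed by whitespace or end of line, so a third
    // colon (the four-part format) does not match.
    private static final Pattern LEGACY =
            Pattern.compile("CRYPT:[^:\\s]+:[^:\\s]+(?:\\s|$)");

    static List<Integer> legacyLineNumbers(List<String> lines) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (LEGACY.matcher(lines.get(i)).find()) {
                hits.add(i + 1); // report 1-based line numbers
            }
        }
        return hits;
    }
}
```

As with the command-line search, a hit is a candidate for review and not proof that the value is a legacy secret.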
Operational Notes
Services should be upgraded to a compatible light-4j decryptor before new
four-part encrypted values are deployed to their configuration. Older runtimes
that only support the three-part CBC format cannot decrypt values generated by
the current light-encryptor.
The compatibility decryptor should log enough detail to identify a format problem, but it must not log plaintext secrets, derived keys, ciphertext bytes, or the master password.
Open Questions
- What package should own the migration utility: light-encryptor, light-4j, or a small standalone migration artifact?
- What benchmark baseline should define the future CRYPT2 KDF cost across container, VM, and local developer environments?
- Which HashUtil callers can safely persist upgraded hashes after needsRehash(...) returns true?