Web Race Conditions – Success and Failure – a Keycloak Case Study
In today’s connected world, many organizations’ “keys to the kingdom” are held in identity and access management (IAM) solutions; these play a crucial role in protecting organizations’ assets.
In this post, we delve into the world of Keycloak, a popular open-source IAM solution.
As part of our work at CyberArk Labs, we research open-source projects and look for security issues so we can share our findings with the open-source and security communities.
Our research focuses on dissecting Keycloak’s security mechanisms. We will delve into a straightforward yet powerful technique for fuzzing LDAP servers. Then, we will deep dive into web race conditions, showcasing two distinct scenarios: (1) one where concurrency appears to be effectively managed and (2) another scenario where we discovered a race condition within the application.
But that’s not all; brace yourself for the finale: presenting and analyzing the root cause of a security issue we found (CVE-2024-1722).
The Target System: Keycloak
Keycloak, the open-source identity and access management solution, simplifies securing your applications by seamlessly managing user authentication, authorization and single sign-on (SSO).
According to our Shodan search, there are ~27,000 internet-facing Keycloak systems; additionally, Keycloak boasts an extensive library of extensions, including specialized support for the French administration identity provider, France-connect.
Keycloak is the upstream of Red Hat Single Sign-On (RH-SSO) and is actively maintained by Red Hat.
So, with our target set, let’s start researching.
Our Keycloak Security Research
We start at the Keycloak website and documentation. While reading the documentation, we focus on identifying the system’s interfaces, looking for potential attack vectors, and defining a scope for our security review.
Figure 1: Keycloak interfaces
As can be seen in Figure 1, Keycloak has many interfaces.
Our scope included the following areas:
- Web UI + web API – authorization tests
- LDAP integrations
- User self-registration flow
- Client (applications) registration
- Integrations with identity providers
- OpenID Connect flow
Although the original scope is vast, in this post, we will only dive into areas we think are worth detailing.
The first Keycloak feature we will look at is LDAP integrations.
LDAP
This functionality enables Keycloak to link with a pre-existing LDAP server (such as Active Directory); this allows users to authenticate using their LDAP credentials and gain access to an application that trusts Keycloak.
LDAP Injections
When encountering an LDAP interface, the first attack method that comes to mind is LDAP injection; if susceptible, this could enable an attacker to gain unauthorized access to the LDAP directory, potentially bypassing access restrictions or even viewing or modifying usernames and passwords.
We tried some known LDAP injection payloads; during that process, we inspected the traffic flowing between our Keycloak and LDAP servers. Based on our inspections, none of the payloads “escaped” from the LDAP query, so the server does not appear to be vulnerable to common LDAP injection.
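To illustrate what “escaping from the LDAP query” means, here is a minimal sketch of filter-value escaping as described in RFC 4515. It is not taken from the Keycloak code base; the function names and the filter shape (`objectClass=inetOrgPerson`, `uid`) are assumptions chosen for the example.

```python
# Characters with special meaning inside an LDAP search filter (RFC 4515),
# mapped to their backslash-hex escape sequences.
_LDAP_FILTER_ESCAPES = {
    "\\": r"\5c",
    "*": r"\2a",
    "(": r"\28",
    ")": r"\29",
    "\x00": r"\00",
}


def escape_filter_value(value: str) -> str:
    """Escape a user-supplied value before placing it in an LDAP filter."""
    return "".join(_LDAP_FILTER_ESCAPES.get(ch, ch) for ch in value)


def build_user_filter(username: str) -> str:
    # Hypothetical filter shape; the query Keycloak actually sends may differ.
    return f"(&(objectClass=inetOrgPerson)(uid={escape_filter_value(username)}))"


# A classic injection attempt that tries to close the filter and append its own clause.
payload = "*)(uid=*))(|(uid=*"
print(build_user_filter(payload))
```

With escaping in place, the parentheses and wildcards in the payload stay inside the `uid` value instead of changing the structure of the filter, which matches what we observed on the wire.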
LDAP Fuzzing
In the Keycloak context, LDAP injection attempts are straightforward because they are performed directly on the user-facing login page. The opposite direction is less obvious: since LDAP servers are usually considered closed, internal systems, we speculated that the data coming from the LDAP server might be overlooked.
LDAP Server as an Attack Vector
Assuming an internal user can update some fields in their LDAP profile (maybe via an internal portal or by other means), and when taking into account previous Keycloak XSS CVEs, there may be some payloads (or special characters) that are acceptable in the LDAP context but, for some reason, “break” the Keycloak LDAP parser – and may cause harm to the system.
Using this attack vector, we are primarily seeking to determine the following:
- Is it possible for us to disrupt other users, potentially denying their login attempts?
- Can we inject a payload that will be executed on the admin web interface (XSS) with the admin’s permissions?
- While less likely, could we cause the system to crash, lock or uncover a memory issue?
Now that we’ve established our theory, let’s go bug hunting!
Fuzzing LDAP – Methodology
To run our tests, we set up a lab environment consisting of an OpenLDAP Docker container and a Keycloak realm configured to use that OpenLDAP server. To populate the directory, we used LDAP Data Interchange Format (LDIF) files and the ldapadd command.
Our fuzzing process is based on the following steps:
- Create an LDAP user template.
- Create a script to produce a single LDAP-user file.
- Execute the script, creating LDIF files.
- Populate the users from the LDIF files to our OpenLDAP server.
- Note: Here, we used the ldapadd command.
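The steps above can be sketched roughly as follows. This is not our original script; the template, the base DN (`ou=people,dc=example,dc=org`), the fuzzed attribute (`givenName`), and the function names are assumptions made for illustration. Values that are not “safe” strings under RFC 2849 are base64-encoded so that special characters survive the LDIF format.

```python
import base64
from pathlib import Path

# Hypothetical user template and base DN; adjust both to the lab directory layout.
LDIF_TEMPLATE = """dn: uid={uid},ou=people,dc=example,dc=org
objectClass: inetOrgPerson
uid: {uid}
cn: {uid}
sn: fuzz
{given_name_line}
mail: {uid}@example.org
"""


def ldif_attr(name: str, value: str) -> str:
    """Render one attribute line, base64-encoding values that RFC 2849 does not treat as safe."""
    safe = (
        all(ord(ch) < 128 and ch not in "\x00\r\n" for ch in value)
        and value[:1] not in (" ", ":", "<")
        and not value.endswith(" ")
    )
    if safe:
        return f"{name}: {value}"
    encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return f"{name}:: {encoded}"


def render_ldif(uid: str, payload: str) -> str:
    """Fill the template for one user, placing the payload in the first-name attribute."""
    return LDIF_TEMPLATE.format(uid=uid, given_name_line=ldif_attr("givenName", payload))


def write_ldif_files(payloads, out_dir):
    """Write one single-user LDIF file per payload and return the created paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, payload in enumerate(payloads, start=1):
        uid = f"fuzz{index:04d}"
        path = out_dir / f"{uid}.ldif"
        path.write_text(render_ldif(uid, payload), encoding="utf-8")
        paths.append(path)
    return paths


if __name__ == "__main__":
    sample_payloads = ["<script>alert(1)</script>", "O'Brien", "名前", " leading-space"]
    for created in write_ldif_files(sample_payloads, "ldif_out"):
        print(created)
```

Each generated file can then be loaded with ldapadd (for example with `-x`, `-D <bind DN>`, `-W` and `-f <file>`); the exact bind DN and server URI depend on the OpenLDAP container configuration.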
Our verification process included the following steps:
- Configure Keycloak to use the relevant OpenLDAP server.
- Inspect the “imported” users in the Keycloak UI.
- Pray for profit.
This method allowed us to inject special characters (which the LDAP server accepts) and create many users faster than the manual process (see Figure 2 below).
Figure 2: Snippet of the users’ list while fuzzing the first-name parameter
We tested a long list of payloads, including special characters and some known XSS payloads; not all payloads are accepted by the OpenLDAP Server, so these are filtered “natively.”
Fuzzing LDAP – Issue Found and Further Research
Using the above fuzzing process, we discovered the following issue:
If two different users share the same email address (which is acceptable in OpenLDAP and Active Directory), the Keycloak admin UI and API malfunction when listing the realm users (see Figure 3 below).
Figure 3: LDAP email issue
As shown in Figure 3 above, when the listing includes the modified user, the UI displays an error and does not present the user list; nevertheless, other users (except the modified user) are unaffected by this issue and can log in normally.
We reported this issue to the vendor, who informed us that it had already been reported.
Even though our fuzzing session did not discover a security issue, this area can be further explored on other systems that support the LDAP protocol.
Web Race Conditions – Success and Failure
While considering our potential attack vectors, one area that caught our attention is User Self-Registration; when enabled, the login page displays a registration link so that a user can create a Keycloak account.
One topic that came up here is race conditions and the question of whether the system handles concurrency-related problems.
While looking for techniques that could be useful for our research, we encountered an excellent blog post by James Kettle (PortSwigger). The article introduces “new classes of race conditions that go far beyond the limit-overrun exploits.”
A recurring theme in the article is “with race conditions, everything is multi-step.” In essence, this means that “Every HTTP request may transition an application through multiple, hidden states; if you time it right, you can abuse these sub-states for unintended transitions, break business logic and achieve high-impact exploits.”
With that in mind, let’s try the learned techniques on our target system.
Everything is Multi-step (Well, Sometimes)
While inspecting the Keycloak database, we noticed that the User Entity table is separated from the Required Action table (see Figure 4 below).
In the Keycloak context, a Required Action is an action a user must complete upon their next login. An example entry is “verify email,” which forces the user to verify their email address upon the next login.
Figure 4: Keycloak user required_action table
Here, our attack theory was: “Creating an internal user and setting their email-verified flag (database row) might be a multi-step process.” If so, by using race-condition techniques, we would be able to log in without email verification (and thereby gain access to an application trusting Keycloak using an arbitrary email address).
Racing Self-Registration
This test required us to create some Burp macros, which allowed us to issue login requests at the same time as user registration requests.
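Outside of Burp, the idea of releasing a registration request and several login requests at the same moment can be sketched with a thread barrier. This is only an illustration of the timing technique, not the macros we used; `register_user` and `login_user` are hypothetical placeholders that would have to be replaced with real HTTP calls against a test realm.

```python
import threading


def race(*actions):
    """Run every callable in its own thread, releasing all of them together."""
    barrier = threading.Barrier(len(actions))
    results = [None] * len(actions)

    def worker(index, action):
        barrier.wait()  # block until every thread is ready, then start together
        try:
            results[index] = action()
        except Exception as exc:  # keep the failure so the caller can inspect it
            results[index] = exc

    threads = [
        threading.Thread(target=worker, args=(index, action))
        for index, action in enumerate(actions)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


def register_user():
    # Placeholder: send the self-registration request for the test realm here.
    return "register-response"


def login_user():
    # Placeholder: send the login request for the account being registered here.
    return "login-response"


print(race(register_user, login_user, login_user))
```

Thread-level synchronization only aligns the moment the requests are sent; network jitter still applies, which is why tooling that synchronizes requests at the packet level tends to be more precise for this kind of test.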
According to our tests, the registration process is not vulnerable to this race condition; let’s find out why in the next section.
Digging Deeper – Race (Condition) Avoidance
While this attack was unsuccessful for us, as researchers, we thought this was an excellent opportunity to inspect the code and see an exemplary implementation to avoid race conditions.
Let’s dive into the code responsible for creating a user in the system.
Figure 5: Add user code snippet (modified for visualization)
In Figure 5, in the addUser function, our theoretical race window starts around line 113 (new user is created) and ends at line 129 (after this line, the required_action seems to be affected), so our attack might be successful at this time window.
Next, let’s look at the model of the user-entity.
Figure 6: Keycloak user-entity model
In Figure 6, we found our first clue: the model uses the jakarta.persistence Entity annotation (line 60).
Jakarta Persistence (the jakarta.persistence package) is a Java specification that enables developers to manage relational data in Java enterprise applications.
While glancing at the Jakarta Persistence documentation, we see that this specification can utilize transactions.
Database Transactions
One of the mitigation strategies for concurrency issues (race conditions) in the context of web applications is using database transactions.
Keycloak uses Jakarta Persistence with Hibernate as its object-relational mapper (ORM) provider.
With that in mind, let’s inspect the database operations while creating the user and see if they indeed use transactions while updating the database. To do that, we enabled debug logging for the ORM (org.hibernate:debug).
Using a debugger, we set a breakpoint at line 129 (just before the required_action table is updated). If our theory held, then at this point (the race window between lines 113 and 129) we should see the new user added to the database without the email-verification requirement.
Figure 7: Inspecting hibernate log while debugging user creation
The figure above shows the Hibernate logs indicating a new user-entity and an SQL Insert statement.
Querying the database at this moment reveals to us that the new user has not yet been created in the database.
After a few debugging trials, we found other hints; when releasing the breakpoint (and letting the user creation flow finish executing), we encountered the following Hibernate log:
Figure 8: Hibernate log indicating the use of transaction
As seen in Figure 8, the Hibernate log indicates the use of transactions, so we conclude that the user creation flow uses transactions.
While considering the database state at runtime and the conclusion above, this explanation is acceptable; nevertheless, we were looking for more ‘solid’ proof.
After a few more debugging sessions and adding tracing to our database, we finally found a sufficient answer.
Figure 9: Hibernate log and database (H2) trace
As seen in the above image (Figure 9), the Hibernate log indicates the use of a transaction (auto-commit disabled), and the database (H2) trace indicates a COMMIT statement.
This means that the database state is updated in “one shot” (a single transaction), so from the perspective of the database (and of other users), the new user either does not exist or exists with the email-verification requirement already set.
Therefore, this code is not vulnerable to our theoretical race condition.
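The same behavior can be reproduced in miniature. The sketch below uses SQLite from the Python standard library instead of H2 and Hibernate, and the two table names and columns are simplified stand-ins rather than Keycloak’s real schema. A second connection plays the role of a concurrent login request that queries the database in the middle of the “race window.”

```python
import os
import sqlite3
import tempfile


def count_users(conn):
    # fetchall() exhausts the statement so the reader does not keep a lock open.
    return conn.execute("SELECT COUNT(*) FROM user_entity").fetchall()[0][0]


def observe_race_window():
    """Return what a second connection sees mid-transaction and after COMMIT."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")
        # isolation_level=None: we issue BEGIN/COMMIT ourselves.
        writer = sqlite3.connect(db_path, isolation_level=None)
        reader = sqlite3.connect(db_path, isolation_level=None)
        try:
            # Simplified stand-ins for the user and required-action tables.
            writer.execute("CREATE TABLE user_entity (id TEXT PRIMARY KEY, email TEXT)")
            writer.execute(
                "CREATE TABLE user_required_action (user_id TEXT, required_action TEXT)"
            )

            writer.execute("BEGIN")
            writer.execute("INSERT INTO user_entity VALUES ('u1', 'alice@example.org')")
            seen_mid = count_users(reader)  # inside the theoretical race window
            writer.execute("INSERT INTO user_required_action VALUES ('u1', 'VERIFY_EMAIL')")
            writer.execute("COMMIT")

            seen_after = count_users(reader)
            actions = reader.execute(
                "SELECT required_action FROM user_required_action WHERE user_id = 'u1'"
            ).fetchall()
            return seen_mid, seen_after, actions
        finally:
            writer.close()
            reader.close()


print(observe_race_window())
```

The reader sees no user at all until the commit, and once the user is visible, the required action is visible with it. That is the property that closes the window between creating the user and recording the email-verification requirement.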
Limit Overrun Race Conditions
Another Keycloak feature that caught our attention is Client-Registrations, specifically Dynamic-Client-Registration; when enabled, clients can register themselves through the Keycloak client registration service; this requires an admin-created access-token (or other configuration).
Here, we wondered: can we exceed the initial-access-token count limit?
We will look into it in the next section.
Keycloak Initial-Access-Token – Limit Overrun Race Condition
Keycloak client-registration initial-access-token (IAT) count limit (set by admin on token creation) can be bypassed when used multiple times in parallel (race condition).
Exploiting this issue is trivial; use a provided initial-access-token in multiple requests and issue the requests in parallel.
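The general pattern behind a limit-overrun race is a check followed by a separate update. The in-memory model below is an assumption made for illustration and does not reflect Keycloak’s actual implementation; the class names are hypothetical, and the optional barrier exists only to force the unlucky interleaving deterministically so the effect is reproducible.

```python
import threading


class NaiveToken:
    """Check-then-decrement without synchronization (the vulnerable pattern)."""

    def __init__(self, remaining, checkpoint=None):
        self.remaining = remaining
        self._checkpoint = checkpoint  # optional barrier that widens the race window

    def use(self):
        if self.remaining <= 0:  # step 1: check the limit
            return False
        if self._checkpoint is not None:
            self._checkpoint.wait()  # every caller has passed the check by now
        self.remaining -= 1  # step 2: update the counter
        return True


class LockedToken:
    """Check and decrement performed as one atomic step."""

    def __init__(self, remaining):
        self.remaining = remaining
        self._lock = threading.Lock()

    def use(self):
        with self._lock:
            if self.remaining <= 0:
                return False
            self.remaining -= 1
            return True


def count_successes(token, attempts):
    """Use the token from `attempts` parallel threads and count accepted uses."""
    results = []
    threads = [
        threading.Thread(target=lambda: results.append(token.use()))
        for _ in range(attempts)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sum(results)


attempts = 5
print("naive: ", count_successes(NaiveToken(1, threading.Barrier(attempts)), attempts))
print("locked:", count_successes(LockedToken(1), attempts))
```

In the naive model, every parallel caller passes the check before any of them decrements the counter, so a token limited to one use is accepted for all callers. Making the check and the update a single atomic operation (a lock here; in a database-backed system, typically a transaction or a conditional update) restores the limit.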
Keycloak IAT Race Condition – Demo
Demo video 1: Keycloak IAT race condition demo
We reported this issue to the vendor, who evaluated it as low severity and classified it as a weakness, so no CVE will be assigned.
As the sections above show, even when some parts of an application are designed with concurrency in mind, other areas of the same system may be overlooked (the web API in our case), leaving race conditions feasible.
CVE-2024-1722 – A Rare but Painful Denial-of-Service (DoS)
Keycloak can be configured to enable (User) self-registration (see Figure 10 below); as an additional security feature, the system can also be configured to verify the user’s email pre-login.
Figure 10: Keycloak configured to enable self-registration
While testing this feature, we found the following issue:
In any realm with “User self-registration” enabled, a user who registered with a username in an email format, and whose username differs from their email, can be denied from logging in (“locked out”) with their username; nevertheless, the victim can still log in using their (registered) email.
Impact:
A successful exploit of the issue will prevent the specific user from logging in to their account using their username; see the demo in the next section.
Potential Attack Scenario – UserMail Conflict
Assumption 1: A realm is configured for user-registration.
Note: The “verify email” and “forgot password” settings can be activated.
Assumption 2: The attacker obtained the victim’s username.
Let’s dive into the scenario.
Figure 11: CVE-2024-1722 – potential attack scenario (icons source: Flaticon)
1. A user registers to the realm using a username in an email format (step (1) in Figure 11).
1.1. We will refer to this user as the victim (Alice in Figure 11 above).
1.2. The user can verify her email using the received email; this can be seen in the demo below (not in the figure).
2. The user (Alice) normally logs in using the username and password (step (2) in Figure 11); at this point, the username and email can be the same.
3. Alice updates her email (to her parent Corp in this case – step (3) in Figure 11); now, the username and email are not the same address.
4. Alice logs in using her username (old email) and password (step (4) in Figure 11) (user credentials can be saved in the user’s browser or password manager).
4.1. Note: At this point, the database lookup matches the username (not the email).
From the user’s perspective, the login is the same.
Time goes by…
Alice logs in using her username and, so far, all is good…
Time goes by…
An attacker obtained the victim’s username (her previous email address).
5. The attacker registers a new user (step (5) in Figure 11).
5.1. The attacker registers and sets his email address as Alice’s old email (now her username); he can use any username.
5.2. The attacker does not need access to the email, even when the “verify email” option is set (seen in the demo below).
6. At this point, the victim cannot access her account using her username (step (6) in Figure 11).
6.1. The victim cannot log in using her username, even after a password reset.
6.2. The victim can log in using her (registered) email.
6.3. In the case where the “forgot password” flow is set, the user can “log in” only once using that flow (see demo).
Alice can no longer log in using her username (old email).
Demo video 2: Keycloak email conflict DoS
We reported this issue to the vendor, who assigned it the following CVE number: CVE-2024-1722.
While the conditions for exploitation might be rare, the impact on the affected user is significant: reverting it requires an admin.
CVE-2024-1722 Root Cause Analysis
So, we found a problem; let’s find the root cause.
When inspecting the login-flow logic, we can see our suspected source.
Figure 12: Login flow user lookup code snippet
Figure 13: Login flow user lookup code snippet
As shown in the figures above (Figures 12 and 13), upon login, the user-entity is looked up in the database, first searching by email and then by username.
This process, in combination with the ability to create a user where the email exists in the database in the username field, causes the issue.
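To make the interaction concrete, here is a simplified in-memory model of the lookup order described above. It is a sketch under stated assumptions, not Keycloak’s code: the `Realm` and `User` classes, the per-column uniqueness checks, and the plaintext password comparison are all hypothetical simplifications.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class User:
    username: str
    email: str
    password: str  # plaintext only to keep the model short


class Realm:
    def __init__(self) -> None:
        self.users: List[User] = []

    def register(self, username: str, email: str, password: str) -> User:
        # Uniqueness is checked per column only; nothing prevents a new email
        # from being equal to an existing user's username.
        if any(u.username == username for u in self.users):
            raise ValueError("username already exists")
        if any(u.email == email for u in self.users):
            raise ValueError("email already exists")
        user = User(username, email, password)
        self.users.append(user)
        return user

    def find_for_login(self, identifier: str) -> Optional[User]:
        # Lookup order from the article: email first, then username.
        for user in self.users:
            if user.email == identifier:
                return user
        for user in self.users:
            if user.username == identifier:
                return user
        return None

    def login(self, identifier: str, password: str) -> bool:
        user = self.find_for_login(identifier)
        return user is not None and user.password == password


realm = Realm()
alice = realm.register("alice@old.example", "alice@old.example", "alice-pw")
alice.email = "alice@corp.example"  # step 3: Alice updates her email
before = realm.login("alice@old.example", "alice-pw")  # step 4: still works

realm.register("mallory", "alice@old.example", "attacker-pw")  # step 5
after = realm.login("alice@old.example", "alice-pw")  # step 6: lookup hits mallory
by_email = realm.login("alice@corp.example", "alice-pw")
print(before, after, by_email)
```

After the attacker registers, the identifier Alice types resolves to the attacker’s account through the email lookup, so her password no longer matches, while logging in with her current email still reaches her own account.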
Key Insights from the Keycloak Case Study
Even though our LDAP fuzzing session discovered a non-security issue, we believe this area can be further explored on other systems.
When auditing a system, even if some parts of an application are considered for concurrency, there may be overlooked areas where race conditions are still feasible.
When inspecting a system in a white-box setup, some hints can tell us concurrency has been considered; additionally, debugging can be helpful when searching for these issues.
Although rare, the username-email conflict can cause a “user lockout” scenario; we suggest looking for this pattern in other user-management systems.
Wrapping up
In this post, we delved into the world of the widely used Keycloak system; our research journey led us through exploring attack vectors and uncovering vulnerabilities from LDAP injections and fuzzing to web race conditions.
Moreover, we shed light on the security issue we found, CVE-2024-1722, an impactful denial-of-service vulnerability.
Whether you’re a developer, system administrator or security enthusiast, understanding these intricacies is vital for safeguarding your systems against potential threats.
Maor Abutbul is a vulnerability researcher at CyberArk Labs.