Metadata-Version: 1.1
Name: defusedxml
Version: 0.4.1
Summary: XML bomb protection for Python stdlib modules
Home-page: https://bitbucket.org/tiran/defusedxml
Author: Christian Heimes
Author-email: christian@python.org
License: PSFL
Download-URL: http://pypi.python.org/pypi/defusedxml
Description: ===================================================
defusedxml -- defusing XML bombs and other exploits
===================================================

    "It's just XML, what could probably go wrong?"

Christian Heimes <christian@python.org>

Synopsis
========

The results of an attack on a vulnerable XML library can be fairly dramatic.
With just a few hundred **Bytes** of XML data an attacker can occupy several
**Gigabytes** of memory within **seconds**. An attacker can also keep
CPUs busy for a long time with a small to medium size request. Under some
circumstances it is even possible to access local files on your
server, to circumvent a firewall, or to abuse services to rebound attacks to
third parties.

The attacks use and abuse less common features of XML and its parsers. The
majority of developers are unacquainted with features such as processing
instructions and entity expansions that XML inherited from SGML. At best
they know about ``<!DOCTYPE>`` from experience with HTML but they are not
aware that a document type definition (DTD) can generate an HTTP request
or load a file from the file system.

None of the issues is new. They have been known for a long time. Billion
laughs was first reported in 2003. Nevertheless some XML libraries and
applications are still vulnerable and even heavy users of XML are
surprised by these features. It's hard to say whom to blame for the
situation.
It's too short sighted to shift all blame on XML parsers and
XML libraries for using insecure default settings. After all they
properly implement XML specifications. Application developers must not rely
on a library always being configured for security and potentially harmful
data by default.


.. contents:: Table of Contents
   :depth: 2


Attack vectors
==============

billion laughs / exponential entity expansion
---------------------------------------------

The `Billion Laughs`_ attack -- also known as exponential entity expansion --
uses multiple levels of nested entities. The original example uses 9 levels
of 10 expansions in each level to expand the string ``lol`` to a string of
3 * 10 :sup:`9` bytes, hence the name "billion laughs". The resulting string
occupies 3 GB (2.79 GiB) of memory; intermediate strings require additional
memory. Because most parsers don't cache the intermediate results, every
expansion is repeated over and over again, which increases the CPU load even
more.

An XML document of just a few hundred bytes can disrupt all services on a
machine within seconds.

Example XML::

    <!DOCTYPE xmlbomb [
    <!ENTITY a "1234567890" >
    <!ENTITY b "&a;&a;&a;&a;&a;&a;&a;&a;">
    <!ENTITY c "&b;&b;&b;&b;&b;&b;&b;&b;">
    <!ENTITY d "&c;&c;&c;&c;&c;&c;&c;&c;">
    ]>
    <bomb>&d;</bomb>


quadratic blowup entity expansion
---------------------------------

A quadratic blowup attack is similar to a `Billion Laughs`_ attack; it abuses
entity expansion, too. Instead of nested entities it repeats one large entity
with a couple of thousand chars over and over again. The attack isn't as
efficient as the exponential case but it avoids triggering countermeasures of
parsers against heavily nested entities. Some parsers limit the depth and
breadth of a single entity but not the total amount of expanded text
throughout an entire XML document.
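
To put rough numbers on both expansion attacks, the following sketch
estimates the expanded size from the entity length and the number of
references. The function names and the figures for the quadratic case (a
50,000 character entity referenced 50,000 times) are illustrative
assumptions, not taken from a real exploit, and the estimate ignores the
small overhead of the DTD markup itself.

```python
def exponential_size(base_len, fanout, levels):
    """Bytes produced when the outermost of `levels` nested entities expands.

    Each level references the previous one `fanout` times.
    """
    return base_len * fanout ** levels


def quadratic_size(entity_len, references):
    """Return (document_bytes, expanded_bytes) for one flat entity.

    The document holds the entity value once plus `references` short
    references such as "&a;"; the DTD wrapper is ignored.
    """
    document_bytes = entity_len + references * len("&a;")
    expanded_bytes = entity_len * references
    return document_bytes, expanded_bytes


# Original billion laughs: "lol" (3 bytes), 10 references per level, 9 levels.
print(exponential_size(3, 10, 9))    # 3000000000
# The example XML above: 10 bytes, 8 references per level, 3 levels.
print(exponential_size(10, 8, 3))    # 5120
# Hypothetical quadratic blowup: 50,000 chars referenced 50,000 times.
print(quadratic_size(50000, 50000))  # (200000, 2500000000)
```

In the hypothetical quadratic case a document of about 200 kB expands to
2.5 GB without any nesting at all, which is why a limit on nesting depth
alone does not help.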

A medium-sized XML document with a couple of hundred kilobytes can require a
couple of hundred MB to several GB of memory. When the attack is combined
with some level of nested expansion an attacker is able to achieve a higher
ratio of success.

::

    <!DOCTYPE bomb [
    <!ENTITY a "xxxxxxx... a couple of ten thousand chars">
    ]>
    <bomb>&a;&a;&a;... repeat</bomb>


external entity expansion (remote)
----------------------------------

Entity declarations can contain more than just text for replacement. They can
also point to external resources by public identifiers or system identifiers.
System identifiers are standard URIs. When the URI is a URL (e.g. a
``http://`` locator) some parsers download the resource from the remote
location and embed it into the XML document verbatim.

Simple example of a parsed external entity::

    <!DOCTYPE external [
    <!ENTITY ee SYSTEM "http://www.python.org/some.xml">
    ]>
    <root>&ee;</root>

The case of parsed external entities works only for valid XML content. The
XML standard also supports unparsed external entities with a
``NData declaration``.

External entity expansion opens the door to plenty of exploits. An attacker
can abuse a vulnerable XML library and application to rebound and forward
network requests with the IP address of the server. What kind of exploit is
possible depends heavily on the parser and the application. For
example:

* An attacker can circumvent firewalls and gain access to restricted
  resources as all the requests are made from an internal and trustworthy
  IP address, not from the outside.
* An attacker can abuse a service to attack, spy on or DoS your servers but
  also third party services. The attack is disguised with the IP address of
  the server and the attacker is able to utilize the high bandwidth of a big
  machine.
* An attacker can exhaust additional resources on the machine, e.g. with
  requests to a service that doesn't respond or responds with very large
  files.
* An attacker may gain knowledge of when, how often and from which IP address
  an XML document is accessed.
* An attacker could send mail from inside your network if the URL handler
  supports ``smtp://`` URIs.


external entity expansion (local file)
--------------------------------------

External entities with references to local files are a sub-case of external
entity expansion. It's listed as an extra attack because it deserves extra
attention. Some XML libraries such as lxml disable network access by default
but still allow entity expansion with local file access by default. Local
files are either referenced with a ``file://`` URL or by a file path (either
relative or absolute).

An attacker may be able to access and download all files that can be read by
the application process. This may include critical configuration files, too.

::

    <!DOCTYPE external [
    <!ENTITY ee SYSTEM "file:///PATH/TO/simple.xml">
    ]>
    <root>&ee;</root>


DTD retrieval
-------------

This case is similar to external entity expansion, too. Some XML libraries
like Python's xml.dom.pulldom retrieve document type definitions from remote
or local locations. Several attack scenarios from the external entity case
apply to this issue as well.

::

    <?xml version="1.0" encoding="utf-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html>
        <head/>
        <body>text</body>
    </html>


Python XML Libraries
====================

.. csv-table:: vulnerabilities and features
   :header: "kind", "sax", "etree", "minidom", "pulldom", "xmlrpc", "lxml", "genshi"
   :widths: 24, 7, 8, 8, 7, 8, 8, 8
   :stub-columns: 0

   "billion laughs", "**True**", "**True**", "**True**", "**True**", "**True**", "False (1)", "False (5)"
   "quadratic blowup", "**True**", "**True**", "**True**", "**True**", "**True**", "**True**", "False (5)"
   "external entity expansion (remote)", "**True**", "False (3)", "False (4)", "**True**", "False", "False (1)", "False (5)"
   "external entity expansion (local file)", "**True**", "False (3)", "False (4)", "**True**", "False", "**True**", "False (5)"
   "DTD retrieval", "**True**", "False", "False", "**True**", "False", "False (1)", "False"
   "gzip bomb", "False", "False", "False", "False", "**True**", "**partly** (2)", "False"
   "xpath support (7)", "False", "False", "False", "False", "False", "**True**", "False"
   "xsl(t) support (7)", "False", "False", "False", "False", "False", "**True**", "False"
   "xinclude support (7)", "False", "**True** (6)", "False", "False", "False", "**True** (6)", "**True**"
   "C library", "expat", "expat", "expat", "expat", "expat", "libxml2", "expat"

1. lxml is protected against billion laughs attacks and doesn't do network
   lookups by default.
2. libxml2 and lxml are not directly vulnerable to gzip decompression bombs
   but they don't protect you against them either.
3. xml.etree doesn't expand entities and raises a ParseError when an entity
   occurs.
4. minidom doesn't expand entities and simply returns the unexpanded entity
   verbatim.
5. genshi.input of genshi 0.6 doesn't support entity expansion and raises a
   ParseError when an entity occurs.
6. Library has (limited) XInclude support but requires an additional step to
   process inclusion.
7. These are features but they may introduce exploitable holes, see
   `Other things to consider`_


Settings in standard library
----------------------------


xml.sax.handler Features
........................

feature_external_ges (http://xml.org/sax/features/external-general-entities)
  disables external entity expansion

feature_external_pes (http://xml.org/sax/features/external-parameter-entities)
  the option is ignored and doesn't modify any functionality

DOM xml.dom.xmlbuilder.Options
..............................

external_parameter_entities
  ignored

external_general_entities
  ignored

external_dtd_subset
  ignored

entities
  unsure


defusedxml
==========

The `defusedxml package`_ (`defusedxml on PyPI`_)
contains several Python-only workarounds and fixes
for denial of service and other vulnerabilities in Python's XML libraries.
In order to benefit from the protection you just have to import and use the
listed functions / classes from the right defusedxml module instead of the
original module. Only `defusedxml.xmlrpc`_ is implemented as a monkey patch.

Instead of::

   >>> from xml.etree.ElementTree import parse
   >>> et = parse(xmlfile)

alter code to::

   >>> from defusedxml.ElementTree import parse
   >>> et = parse(xmlfile)

Additionally the package has an **untested** function to monkey patch
all stdlib modules with ``defusedxml.defuse_stdlib()``.

All functions and parser classes accept three additional keyword arguments.
They return either the same objects as the original functions or compatible
subclasses.

forbid_dtd (default: False)
  disallow XML with a ``<!DOCTYPE>`` declaration and raise a
  *DTDForbidden* exception when a DTD declaration is found.

forbid_entities (default: True)
  disallow XML with ``<!ENTITY>`` declarations inside the DTD and raise an
  *EntitiesForbidden* exception when an entity is declared.

forbid_external (default: True)
  disallow any access to remote or local resources in external entities
  or DTD and raise an *ExternalReferenceForbidden* exception when a DTD
  or entity references an external resource.


defusedxml (package)
--------------------

DefusedXmlException, DTDForbidden, EntitiesForbidden,
ExternalReferenceForbidden, NotSupportedError

defuse_stdlib() (*experimental*)


defusedxml.cElementTree
-----------------------

parse(), iterparse(), fromstring(), XMLParser


defusedxml.ElementTree
-----------------------

parse(), iterparse(), fromstring(), XMLParser


defusedxml.expatreader
----------------------

create_parser(), DefusedExpatParser


defusedxml.sax
--------------

parse(), parseString(), create_parser()


defusedxml.expatbuilder
-----------------------

parse(), parseString(), DefusedExpatBuilder, DefusedExpatBuilderNS


defusedxml.minidom
------------------

parse(), parseString()


defusedxml.pulldom
------------------

parse(), parseString()


defusedxml.xmlrpc
-----------------

The fix is implemented as a monkey patch for the stdlib's xmlrpc package (3.x)
or xmlrpclib module (2.x). The function `monkey_patch()` enables the fixes,
`unmonkey_patch()` removes the patch and puts the code in its former state.

The monkey patch protects against XML related attacks as well as
decompression bombs and excessively large requests or responses. The default
setting is 30 MB for requests, responses and gzip decompression. You can
modify the default by changing the module variable `MAX_DATA`.
A value of `-1` disables the limit.


defusedxml.lxml
---------------

The module acts as an *example* of how you could protect code that uses
lxml.etree. It implements a custom Element class that filters out
Entity instances, a custom parser factory and a thread local storage for
parser instances. It also has a check_docinfo() function which inspects
a tree for internal or external DTDs and entity declarations. In order to
check for entities lxml > 3.0 is required.

parse(), fromstring()
RestrictedElement, GlobalParserTLS, getDefaultParser(), check_docinfo()


defusedexpat
============

The `defusedexpat package`_ (`defusedexpat on PyPI`_)
comes with binary extensions and a
`modified expat`_ library instead of the standard `expat parser`_. It's
basically a stand-alone version of the patches for Python's standard
library C extensions.

Modifications in expat
----------------------

new definitions::

  XML_BOMB_PROTECTION
  XML_DEFAULT_MAX_ENTITY_INDIRECTIONS
  XML_DEFAULT_MAX_ENTITY_EXPANSIONS
  XML_DEFAULT_RESET_DTD

new XML_FeatureEnum members::

  XML_FEATURE_MAX_ENTITY_INDIRECTIONS
  XML_FEATURE_MAX_ENTITY_EXPANSIONS
  XML_FEATURE_IGNORE_DTD

new XML_Error members::

  XML_ERROR_ENTITY_INDIRECTIONS
  XML_ERROR_ENTITY_EXPANSION

new API functions::

  int XML_GetFeature(XML_Parser parser,
                     enum XML_FeatureEnum feature,
                     long *value);
  int XML_SetFeature(XML_Parser parser,
                     enum XML_FeatureEnum feature,
                     long value);
  int XML_GetFeatureDefault(enum XML_FeatureEnum feature,
                            long *value);
  int XML_SetFeatureDefault(enum XML_FeatureEnum feature,
                            long value);

XML_FEATURE_MAX_ENTITY_INDIRECTIONS
   Limit the number of indirections that are allowed to occur during the
   expansion of a nested entity. A counter starts when an entity reference
   is encountered. It resets after the entity is fully expanded. The limit
   protects the parser against exponential entity expansion attacks (aka
   billion laughs attack). When the limit is exceeded the parser stops and
   fails with `XML_ERROR_ENTITY_INDIRECTIONS`.
   A value of 0 disables the protection.

   Supported range
     0 .. UINT_MAX
   Default
     40

XML_FEATURE_MAX_ENTITY_EXPANSIONS
   Limit the total length of all entity expansions throughout the entire
   document. The lengths of all entities are accumulated in a parser variable.
   The setting protects against quadratic blowup attacks (lots of expansions
   of a large entity declaration). When the sum of all entities exceeds
   the limit, the parser stops and fails with `XML_ERROR_ENTITY_EXPANSION`.
   A value of 0 disables the protection.

   Supported range
     0 .. UINT_MAX
   Default
     8 MiB

XML_FEATURE_RESET_DTD
   Reset all DTD information after the <!DOCTYPE> block has been parsed. When
   the flag is set (default: false) all DTD information is reset after the
   endDoctypeDeclHandler has been called. The flag can be set inside the
   endDoctypeDeclHandler. Without DTD information any entity reference in
   the document body leads to `XML_ERROR_UNDEFINED_ENTITY`.

   Supported range
     0, 1
   Default
     0


How to avoid XML vulnerabilities
================================

Best practices
--------------

* Don't allow DTDs
* Don't expand entities
* Don't resolve externals
* Limit parse depth
* Limit total input size
* Limit parse time
* Favor a SAX or iterparse-like parser for potentially large data
* Validate and properly quote arguments to XSL transformations and
  XPath queries
* Don't use XPath expressions from untrusted sources
* Don't apply XSL transformations that come from untrusted sources

(based on Brad Hill's `Attacking XML Security`_)


Other things to consider
========================

XML, XML parsers and processing libraries have more features and possible
issues that could lead to DoS vulnerabilities or security exploits in
applications. I have compiled an incomplete list of theoretical issues that
need further research and more attention. The list is deliberately pessimistic
and a bit paranoid, too. It contains things that might go wrong under daffy
circumstances.


attribute blowup / hash collision attack
----------------------------------------

XML parsers may use an algorithm with quadratic runtime O(n :sup:`2`) to
handle attributes and namespaces. If a parser uses hash tables (dictionaries)
to store attributes and namespaces the implementation may be vulnerable to
hash collision attacks, thus reducing the performance to O(n :sup:`2`) again.
In either case an attacker is able to forge a denial of service attack with
an XML document that contains thousands upon thousands of attributes in
a single node.

I haven't researched yet if expat, pyexpat or libxml2 are vulnerable.
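
Two of the best practices listed earlier, rejecting DTDs and limiting the
total input size, can be sketched with nothing but the standard library's
expat binding. This is a minimal illustration and not how defusedxml itself
is implemented; ``MAX_INPUT_BYTES``, ``DoctypeRejected`` and
``check_document()`` are hypothetical names chosen for the example, and the
1 MiB limit is an arbitrary assumption.

```python
import xml.parsers.expat

MAX_INPUT_BYTES = 1024 * 1024  # assumed limit; tune it for the application


class DoctypeRejected(ValueError):
    """Raised when a document contains a <!DOCTYPE> declaration."""


def check_document(data, max_bytes=MAX_INPUT_BYTES):
    """Parse `data` (bytes) once and refuse oversized input or any DTD."""
    if len(data) > max_bytes:
        raise ValueError("XML input exceeds %d bytes" % max_bytes)

    def reject_doctype(name, system_id, public_id, has_internal_subset):
        # expat calls this handler when it reaches <!DOCTYPE ...>, before
        # the entity declarations of the internal subset are processed.
        raise DoctypeRejected("DTD %r is not allowed" % name)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = reject_doctype
    parser.Parse(data, True)  # exceptions raised in handlers propagate
    return True


print(check_document(b"<root>ok</root>"))
try:
    check_document(b'<!DOCTYPE bomb [<!ENTITY a "1234567890">]><bomb>&a;</bomb>')
except DoctypeRejected as exc:
    print("rejected:", exc)
```

The check parses the document a second time if the application parses it
again afterwards; in real code the same handler would be installed on the
parser that does the actual work.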


decompression bomb
------------------

The issue of decompression bombs (aka `ZIP bomb`_) applies to all XML libraries
that can parse compressed XML streams like gzipped HTTP streams or LZMA-ed
files. For an attacker it can reduce the amount of transmitted data by three
orders of magnitude or more. Gzip is able to compress 1 GiB of zeros to
roughly 1 MB, lzma is even better::

    $ dd if=/dev/zero bs=1M count=1024 | gzip > zeros.gz
    $ dd if=/dev/zero bs=1M count=1024 | lzma -z > zeros.xy
    $ ls -sh zeros.*
    1020K zeros.gz
     148K zeros.xy

None of Python's standard XML libraries decompress streams except for
``xmlrpclib``. The module is vulnerable <http://bugs.python.org/issue16043>
to decompression bombs.

lxml can load and process compressed data through libxml2 transparently.
libxml2 can handle even very large blobs of compressed data efficiently
without using too much memory. But it doesn't protect applications from
decompression bombs. A carefully written SAX or iterparse-like approach can
be safe.


Processing Instruction
----------------------

`PI`_'s like::

  <?xml-stylesheet type="text/xsl" href="style.xsl"?>

may impose more threats for XML processing. It depends on whether and how a
processor handles processing instructions. The issue of URL retrieval with
network or local file access applies to processing instructions, too.


Other DTD features
------------------

`DTD`_ has more features like ``<!NOTATION>``. I haven't researched how
these features may be a security threat.


XPath
-----

XPath statements may introduce DoS vulnerabilities. Code should never execute
queries from untrusted sources. An attacker may also be able to create an XML
document that makes certain XPath queries costly or resource hungry.
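
Coming back to the decompression bomb case described earlier, an
application that has to accept gzipped XML can cap the amount of data it is
willing to inflate before the parser ever sees it. The following is only a
sketch under assumptions: ``gunzip_limited()`` and ``MAX_DECOMPRESSED_BYTES``
are hypothetical names, the limit simply mirrors the 30 MB default mentioned
for `defusedxml.xmlrpc`_, and ``gzip.compress()`` requires Python 3.

```python
import gzip
import io

MAX_DECOMPRESSED_BYTES = 30 * 1024 * 1024  # assumption, see lead-in


def gunzip_limited(data, max_bytes=MAX_DECOMPRESSED_BYTES):
    """Decompress gzip `data` but never keep more than `max_bytes` + 1 bytes."""
    with gzip.GzipFile(fileobj=io.BytesIO(data)) as stream:
        # Ask for one byte more than allowed. The stream is inflated
        # incrementally, so a bomb is detected without expanding all of it.
        payload = stream.read(max_bytes + 1)
    if len(payload) > max_bytes:
        raise ValueError("decompressed data exceeds %d bytes" % max_bytes)
    return payload


print(gunzip_limited(gzip.compress(b"<root/>")))

zeros = gzip.compress(b"\0" * (10 * 1024 * 1024))  # 10 MiB of zeros
try:
    gunzip_limited(zeros, max_bytes=1024 * 1024)
except ValueError as exc:
    print("rejected:", exc)
```

The limit bounds memory use only; the decompressed payload still has to be
parsed with one of the entity and DTD protections described above.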


XPath injection attacks
-----------------------

XPath injection attacks pretty much work like SQL injection attacks.
Arguments to XPath queries must be quoted and validated properly, especially
when they are taken from the user. The page `Avoid the dangers of XPath injection`_
lists some ramifications of XPath injections.

Python's standard library doesn't have XPath support. lxml supports
parameterized XPath queries which do proper quoting. You just have to use
its xpath() method correctly::

    # DON'T
    >>> tree.xpath("/tag[@id='%s']" % value)

    # instead do
    >>> tree.xpath("/tag[@id=$tagid]", tagid=value)


XInclude
--------

`XML Inclusion`_ is another way to load and include external files::

    <root xmlns:xi="http://www.w3.org/2001/XInclude">
      <xi:include href="filename.txt" parse="text" />
    </root>

This feature should be disabled when XML files from an untrusted source are
processed. Some Python XML libraries and libxml2 support XInclude but don't
have an option to sandbox inclusion and limit it to allowed directories.


XMLSchema location
------------------

A validating XML parser may download schema files from the information in a
``xsi:schemaLocation`` attribute.

::

    <ead xmlns="urn:isbn:1-931666-22-9"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="urn:isbn:1-931666-22-9 http://www.loc.gov/ead/ead.xsd">
    </ead>


XSL Transformation
------------------

You should keep in mind that XSLT is a Turing complete language. Never
process XSLT code from unknown or untrusted sources! XSLT processors may
allow you to interact with external resources in ways you can't even imagine.
Some processors even support extensions that allow read/write access to the
file system, access to JRE objects or scripting with Jython.

Example from `Attacking XML Security`_ for Xalan-J::

    <xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:rt="http://xml.apache.org/xalan/java/java.lang.Runtime"
     xmlns:ob="http://xml.apache.org/xalan/java/java.lang.Object"
     exclude-result-prefixes="rt ob">
     <xsl:template match="/">
       <xsl:variable name="runtimeObject" select="rt:getRuntime()"/>
       <xsl:variable name="command"
         select="rt:exec($runtimeObject, 'c:\Windows\system32\cmd.exe')"/>
       <xsl:variable name="commandAsString" select="ob:toString($command)"/>
       <xsl:value-of select="$commandAsString"/>
     </xsl:template>
    </xsl:stylesheet>


Related CVEs
============

CVE-2013-1664
  Unrestricted entity expansion induces DoS vulnerabilities in Python XML
  libraries (XML bomb)

CVE-2013-1665
  External entity expansion in Python XML libraries inflicts potential
  security flaws and DoS vulnerabilities


Other languages / frameworks
=============================

Several other programming languages and frameworks are vulnerable as well. A
couple of them are affected by the fact that libxml2 up to 2.9.0 has no
protection against quadratic blowup attacks. Most of them have potentially
dangerous default settings for entity expansion and external entities, too.

Perl
----

Perl's XML::Simple is vulnerable to quadratic entity expansion and external
entity expansion (both local and remote).


Ruby
----

Ruby's REXML document parser is vulnerable to entity expansion attacks
(both quadratic and exponential) but it doesn't do external entity
expansion by default. In order to counteract entity expansion you have to
disable the feature::

  REXML::Document.entity_expansion_limit = 0

libxml-ruby and hpricot don't expand entities in their default configuration.


PHP
---

PHP's SimpleXML API is vulnerable to quadratic entity expansion and loads
entities from local and remote resources. The option ``LIBXML_NONET`` disables
network access but still allows local file access. ``LIBXML_NOENT`` seems to
have no effect on entity expansion in PHP 5.4.6.


C# / .NET / Mono
----------------

Information in `XML DoS and Defenses (MSDN)`_ suggests that .NET is
vulnerable with its default settings. The article contains code snippets
that show how to create a secure XML reader::

  XmlReaderSettings settings = new XmlReaderSettings();
  settings.ProhibitDtd = false;
  settings.MaxCharactersFromEntities = 1024;
  settings.XmlResolver = null;
  XmlReader reader = XmlReader.Create(stream, settings);


Java
----

Untested. The documentation of Xerces and its `Xerces SecurityMananger`_
suggests that Xerces is also vulnerable to billion laughs attacks with its
default settings. It also does entity resolving when an
``org.xml.sax.EntityResolver`` is configured. I'm not yet sure about the
default setting here.

Java specialists suggest using a custom builder factory::

  // imports: javax.xml.XMLConstants, javax.xml.parsers.DocumentBuilderFactory
  DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
  builderFactory.setXIncludeAware(false);
  builderFactory.setExpandEntityReferences(false);
  builderFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
  // either
  builderFactory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
  // or if you need DTDs
  builderFactory.setFeature("http://xml.org/sax/features/external-general-entities", false);
  builderFactory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
  builderFactory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
  builderFactory.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);


TODO
====

* DOM: Use xml.dom.xmlbuilder options for entity handling
* SAX: take feature_external_ges and feature_external_pes (?) into account
* test experimental monkey patching of stdlib modules
* improve documentation


License
=======

Copyright (c) 2013 by Christian Heimes <christian@python.org>

Licensed to PSF under a Contributor Agreement.

See http://www.python.org/psf/license for licensing details.


Acknowledgements
================

Brett Cannon (Python Core developer)
  review and code cleanup

Antoine Pitrou (Python Core developer)
  code review

Aaron Patterson, Ben Murphy and Michael Koziarski (Ruby community)
  Many thanks to Aaron, Ben and Michael from the Ruby community for their
  report and assistance.

Thierry Carrez (OpenStack)
  Many thanks to Thierry for his report to the Python Security Response
  Team on behalf of the OpenStack security team.

Carl Meyer (Django)
  Many thanks to Carl for his report to PSRT on behalf of the Django security
  team.

Daniel Veillard (libxml2)
  Many thanks to Daniel for his insight and assistance with libxml2.

semantics GmbH (http://www.semantics.de/)
  Many thanks to my employer semantics for letting me work on the issue
  during working hours as part of semantics's open source initiative.


References
==========

* `XML DoS and Defenses (MSDN)`_
* `Billion Laughs`_ on Wikipedia
* `ZIP bomb`_ on Wikipedia
* `Configure SAX parsers for secure processing`_
* `Testing for XML Injection`_

.. _defusedxml package: https://bitbucket.org/tiran/defusedxml
.. _defusedxml on PyPI: https://pypi.python.org/pypi/defusedxml
.. _defusedexpat package: https://bitbucket.org/tiran/defusedexpat
.. _defusedexpat on PyPI: https://pypi.python.org/pypi/defusedexpat
.. _modified expat: https://bitbucket.org/tiran/expat
.. _expat parser: http://expat.sourceforge.net/
.. _Attacking XML Security: https://www.isecpartners.com/media/12976/iSEC-HILL-Attacking-XML-Security-bh07.pdf
.. _Billion Laughs: http://en.wikipedia.org/wiki/Billion_laughs
.. _XML DoS and Defenses (MSDN): http://msdn.microsoft.com/en-us/magazine/ee335713.aspx
.. _ZIP bomb: http://en.wikipedia.org/wiki/Zip_bomb
.. _DTD: http://en.wikipedia.org/wiki/Document_Type_Definition
.. _PI: https://en.wikipedia.org/wiki/Processing_Instruction
.. _Avoid the dangers of XPath injection: http://www.ibm.com/developerworks/xml/library/x-xpathinjection/index.html
.. _Configure SAX parsers for secure processing: http://www.ibm.com/developerworks/xml/library/x-tipcfsx/index.html
.. _Testing for XML Injection: https://www.owasp.org/index.php/Testing_for_XML_Injection_(OWASP-DV-008)
.. _Xerces SecurityMananger: http://xerces.apache.org/xerces2-j/javadocs/xerces2/org/apache/xerces/util/SecurityManager.html
.. _XML Inclusion: http://www.w3.org/TR/xinclude/#include_element

Changelog
=========

defusedxml 0.4.1
----------------

*Release date: 28-Mar-2013*

- Add more demo exploits, e.g. python_external.py and Xalan XSLT demos.
- Improved documentation.


defusedxml 0.4
--------------

*Release date: 25-Feb-2013*

- As per http://seclists.org/oss-sec/2013/q1/340 please REJECT
  CVE-2013-0278, CVE-2013-0279 and CVE-2013-0280 and use CVE-2013-1664,
  CVE-2013-1665 for OpenStack/etc.
- Add missing parser_list argument to sax.make_parser(). The argument is
  ignored, though. (thanks to Florian Apolloner)
- Add demo exploit for external entity attack on Python's SAX parser, XML-RPC
  and WebDAV.


defusedxml 0.3
--------------

*Release date: 19-Feb-2013*

- Improve documentation


defusedxml 0.2
--------------

*Release date: 15-Feb-2013*

- Rename ExternalEntitiesForbidden to ExternalReferenceForbidden
- Rename defusedxml.lxml.check_dtd() to check_docinfo()
- Unify argument names in callbacks
- Add arguments and formatted representation to exceptions
- Add forbid_external argument to all functions and classes
- More tests
- LOTS of documentation
- Add example code for other languages (Ruby, Perl, PHP) and parsers (Genshi)
- Add protection against XML and gzip attacks to xmlrpclib

defusedxml 0.1
--------------

*Release date: 08-Feb-2013*

- Initial and internal release for PSRT review

Keywords: xml bomb DoS
Platform: all
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Python Software Foundation License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.6
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.1
Classifier: Programming Language :: Python :: 3.2
Classifier: Programming Language :: Python :: 3.3
Classifier: Topic :: Text Processing :: Markup :: XML