github.com/munnerz/test-infra@v0.0.0-20190108210205-ce3d181dc989/gubernator/third_party/defusedxml-0.4.1-py2.7.egg-info/PKG-INFO (about)

     1  Metadata-Version: 1.1
     2  Name: defusedxml
     3  Version: 0.4.1
     4  Summary: XML bomb protection for Python stdlib modules
     5  Home-page: https://bitbucket.org/tiran/defusedxml
     6  Author: Christian Heimes
     7  Author-email: christian@python.org
     8  License: PSFL
     9  Download-URL: http://pypi.python.org/pypi/defusedxml
    10  Description: ===================================================
    11          defusedxml -- defusing XML bombs and other exploits
    12          ===================================================
    13          
    14              "It's just XML, what could possibly go wrong?"
    15          
    16          Christian Heimes <christian@python.org>
    17          
    18          Synopsis
    19          ========
    20          
    21          The results of an attack on a vulnerable XML library can be fairly dramatic.
    22          With just a few hundred **Bytes** of XML data an attacker can occupy several
    23          **Gigabytes** of memory within **seconds**. An attacker can also keep
    24          CPUs busy for a long time with a small to medium size request. Under some
    25          circumstances it is even possible to access local files on your
    26          server, to circumvent a firewall, or to abuse services to rebound attacks to
    27          third parties.
    28          
    29          The attacks use and abuse less common features of XML and its parsers. The
    30          majority of developers are unacquainted with features such as processing
    31          instructions and entity expansions that XML inherited from SGML. At best
    32          they know about ``<!DOCTYPE>`` from experience with HTML but they are not
    33          aware that a document type definition (DTD) can generate an HTTP request
    34          or load a file from the file system.
    35          
    36          None of the issues is new. They have been known for a long time. Billion
    37          laughs was first reported in 2003. Nevertheless some XML libraries and
    38          applications are still vulnerable and even heavy users of XML are
    39          surprised by these features. It's hard to say whom to blame for the
    40          situation. It would be short-sighted to put all the blame on XML parsers
    41          and XML libraries for using insecure default settings. After all, they
    42          properly implement the XML specifications. Application developers must not
    43          rely on a library being configured for security and for potentially harmful
    44          data by default.
    45          
    46          
    47          .. contents:: Table of Contents
    48             :depth: 2
    49          
    50          
    51          Attack vectors
    52          ==============
    53          
    54          billion laughs / exponential entity expansion
    55          ---------------------------------------------
    56          
    57          The `Billion Laughs`_ attack -- also known as exponential entity expansion --
    58          uses multiple levels of nested entities. The original example uses 9 levels
    59          of 10 expansions in each level to expand the string ``lol`` to a string of
    60          3 * 10 :sup:`9` bytes, hence the name "billion laughs". The resulting string
    61          occupies 3 GB (2.79 GiB) of memory; intermediate strings require additional
    62          memory. Because most parsers don't cache the intermediate results, every
    63          expansion is repeated over and over again, which increases the CPU load
    64          even more.
    65          
    66          An XML document of just a few hundred bytes can disrupt all services on a
    67          machine within seconds.
    68          
    69          Example XML::
    70          
    71              <!DOCTYPE xmlbomb [
    72              <!ENTITY a "1234567890" >
    73              <!ENTITY b "&a;&a;&a;&a;&a;&a;&a;&a;">
    74              <!ENTITY c "&b;&b;&b;&b;&b;&b;&b;&b;">
    75              <!ENTITY d "&c;&c;&c;&c;&c;&c;&c;&c;">
    76              ]>
    77              <bomb>&d;</bomb>
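
The growth described above can be checked with simple arithmetic. The sketch below is only a back-of-the-envelope calculation: nothing is parsed, and the function name ``expanded_size`` is made up for this illustration.

```python
def expanded_size(base_len, fan_out, levels):
    """Size of the fully expanded text for a chain of nested entities.

    base_len: length of the string at the bottom of the entity chain
    fan_out:  number of references each entity makes to the level below
    levels:   number of nested entity levels above the base string
    """
    return base_len * fan_out ** levels


# The original "billion laughs" document: "lol", 10 references, 9 levels.
original = expanded_size(len("lol"), 10, 9)
print(original, "bytes =", round(original / 2**30, 2), "GiB")

# The smaller example above: a 10-byte string, 8 references, 3 levels (b, c, d).
example = expanded_size(len("1234567890"), 8, 3)
print(example, "bytes")
```

Each additional level multiplies the result by the fan-out, which is why a document of a few hundred bytes is enough.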
    78          
    79          
    80          quadratic blowup entity expansion
    81          ---------------------------------
    82          
    83          A quadratic blowup attack is similar to a `Billion Laughs`_ attack; it abuses
    84          entity expansion, too. Instead of nested entities it repeats one large entity
    85          with a couple of thousand chars over and over again. The attack isn't as
    86          efficient as the exponential case but it avoids triggering countermeasures of
    87          parsers against heavily nested entities. Some parsers limit the depth and
    88          breadth of a single entity but not the total amount of expanded text
    89          throughout an entire XML document.
    90          
    91          A medium-sized XML document with a couple of hundred kilobytes can require a
    92          couple of hundred MB to several GB of memory. When the attack is combined
    93          with some level of nested expansion an attacker is able to achieve a higher
    94          ratio of success.
    95          
    96          ::
    97          
    98              <!DOCTYPE bomb [
    99              <!ENTITY a "xxxxxxx... a couple of ten thousand chars">
   100              ]>
   101              <bomb>&a;&a;&a;... repeat</bomb>
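
To make the ratio concrete, here is a small sketch that builds a deliberately tiny document of this shape and parses it with the standard library. The helper name ``build_quadratic_doc`` and all sizes are assumptions chosen for illustration; the attack-scale figures are computed only, never parsed.

```python
import xml.etree.ElementTree as ET


def build_quadratic_doc(entity_len, repeats):
    """Build a document that references one entity `repeats` times."""
    entity = "x" * entity_len
    refs = "&a;" * repeats
    return '<!DOCTYPE bomb [<!ENTITY a "%s">]><bomb>%s</bomb>' % (entity, refs)


# Harmless scale: a 100-char entity referenced 50 times.
doc = build_quadratic_doc(entity_len=100, repeats=50)
expanded = ET.fromstring(doc).text
print(len(doc), "chars of XML ->", len(expanded), "chars of text")

# Attack scale, computed only: a 100,000-char entity referenced 30,000 times.
attack_doc_size = 100_000 + 30_000 * len("&a;")  # entity body plus references
attack_expanded = 100_000 * 30_000               # total expanded text
print(attack_doc_size, "chars of XML ->", attack_expanded, "chars of text")
```

The document grows linearly with the number of references while the expanded text grows with the product of entity length and reference count.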
   102          
   103          
   104          external entity expansion (remote)
   105          ----------------------------------
   106          
   107          Entity declarations can contain more than just text for replacement. They can
   108          also point to external resources by public identifiers or system identifiers.
   109          System identifiers are standard URIs. When the URI is a URL (e.g. a
   110          ``http://`` locator) some parsers download the resource from the remote
   111          location and embed it into the XML document verbatim.
   112          
   113          Simple example of a parsed external entity::
   114          
   115              <!DOCTYPE external [
   116              <!ENTITY ee SYSTEM "http://www.python.org/some.xml">
   117              ]>
   118              <root>&ee;</root>
   119          
   120          The case of parsed external entities works only for valid XML content. The
   121          XML standard also supports unparsed external entities with an
   122          ``NDATA`` declaration.
   123          
   124          External entity expansion opens the door to plenty of exploits. An attacker
   125          can abuse a vulnerable XML library and application to rebound and forward
   126          network requests with the IP address of the server. What kind of exploit is
   127          possible depends heavily on the parser and the application. For
   128          example:
   129          
   130          * An attacker can circumvent firewalls and gain access to restricted
   131            resources as all the requests are made from an internal and trustworthy
   132            IP address, not from the outside.
   133          * An attacker can abuse a service to attack, spy on or DoS not only your
   134            servers but also third-party services. The attack is disguised with the
   135            IP address of the server and the attacker is able to utilize the high
   136            bandwidth of a big machine.
   137          * An attacker can exhaust additional resources on the machine, e.g. with
   138            requests to a service that doesn't respond or responds with very large
   139            files.
   140          * An attacker may learn when, how often and from which IP address
   141            an XML document is accessed.
   142          * An attacker could send mail from inside your network if the URL handler
   143            supports ``smtp://`` URIs.
   144          
   145          
   146          external entity expansion (local file)
   147          --------------------------------------
   148          
   149          External entities with references to local files are a sub-case of external
   150          entity expansion. It's listed as an extra attack because it deserves extra
   151          attention. Some XML libraries such as lxml disable network access by default
   152          but still allow entity expansion with local file access by default. Local
   153          files are either referenced with a ``file://`` URL or by a file path (either
   154          relative or absolute).
   155          
   156          An attacker may be able to access and download all files that can be read by
   157          the application process. This may include critical configuration files, too.
   158          
   159          ::
   160          
   161              <!DOCTYPE external [
   162              <!ENTITY ee SYSTEM "file:///PATH/TO/simple.xml">
   163              ]>
   164              <root>&ee;</root>
   165          
   166          
   167          DTD retrieval
   168          -------------
   169          
   170          This case is similar to external entity expansion, too. Some XML libraries
   171          like Python's xml.dom.pulldom retrieve document type definitions from remote
   172          or local locations. Several attack scenarios from the external entity case
   173          apply to this issue as well.
   174          
   175          ::
   176          
   177              <?xml version="1.0" encoding="utf-8"?>
   178              <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
   179                "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
   180              <html>
   181                  <head/>
   182                  <body>text</body>
   183              </html>
   184          
   185          
   186          Python XML Libraries
   187          ====================
   188          
   189          .. csv-table:: vulnerabilities and features
   190             :header: "kind", "sax", "etree", "minidom", "pulldom", "xmlrpc", "lxml", "genshi"
   191             :widths: 24, 7, 8, 8, 7, 8, 8, 8
   192             :stub-columns: 0
   193          
   194             "billion laughs", "**True**", "**True**", "**True**", "**True**", "**True**", "False (1)", "False (5)"
   195             "quadratic blowup", "**True**", "**True**", "**True**", "**True**", "**True**", "**True**", "False (5)"
   196             "external entity expansion (remote)", "**True**", "False (3)", "False (4)", "**True**", "False", "False (1)", "False (5)"
   197             "external entity expansion (local file)", "**True**", "False (3)", "False (4)", "**True**", "False", "**True**", "False (5)"
   198             "DTD retrieval", "**True**", "False", "False", "**True**", "False", "False (1)", "False"
   199             "gzip bomb", "False", "False", "False", "False", "**True**", "**partly** (2)", "False"
   200             "xpath support (7)", "False", "False", "False", "False", "False", "**True**", "False"
   201             "xsl(t) support (7)", "False", "False", "False", "False", "False", "**True**", "False"
   202             "xinclude support (7)", "False", "**True** (6)", "False", "False", "False", "**True** (6)", "**True**"
   203             "C library", "expat", "expat", "expat", "expat", "expat", "libxml2", "expat"
   204          
   205          1. lxml is protected against billion laughs attacks and doesn't do network
   206             lookups by default.
   207          2. libxml2 and lxml are not directly vulnerable to gzip decompression bombs
   208             but they don't protect you against them either.
   209          3. xml.etree doesn't expand external entities and raises a ParseError when
   210             an entity occurs.
   211          4. minidom doesn't expand external entities and simply returns the
   212             unexpanded entity verbatim.
   213          5. genshi.input of genshi 0.6 doesn't support entity expansion and raises a
   214             ParseError when an entity occurs.
   215          6. Library has (limited) XInclude support but requires an additional step to
   216             process inclusion.
   217          7. These are features but they may introduce exploitable holes, see
   218             `Other things to consider`_
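
Note 3 can be observed with a short sketch. The document and the ``file://`` path below are made up for this illustration, and the path is assumed not to exist; the exact error message may differ between Python versions.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the SYSTEM identifier points at a file that is
# assumed not to exist.
DOC = """<!DOCTYPE external [
<!ENTITY ee SYSTEM "file:///nonexistent/defusedxml-demo.xml">
]>
<root>&ee;</root>"""

try:
    ET.fromstring(DOC)
except ET.ParseError as exc:
    # xml.etree refuses the reference instead of loading the resource.
    outcome = "rejected: %s" % exc
else:
    outcome = "parsed"

print(outcome)
```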
   219          
   220          
   221          Settings in standard library
   222          ----------------------------
   223          
   224          
   225          xml.sax.handler Features
   226          ........................
   227          
   228          feature_external_ges (http://xml.org/sax/features/external-general-entities)
   229            setting this feature to False disables external entity expansion
   230          
   231          feature_external_pes (http://xml.org/sax/features/external-parameter-entities)
   232            the option is ignored and doesn't modify any functionality
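
As a rough sketch of how ``feature_external_ges`` is switched off on a SAX parser: the ``TextCollector`` class, the document and the non-existent ``file://`` path are assumptions made for this example.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges


class TextCollector(ContentHandler):
    """Collect all character data reported by the parser."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


# Hypothetical document; the referenced file is assumed not to exist.
DOC = b"""<!DOCTYPE external [
<!ENTITY ee SYSTEM "file:///nonexistent/defusedxml-demo.xml">
]>
<root>before-&ee;-after</root>"""

parser = xml.sax.make_parser()
# Explicitly refuse to include external general entities.
parser.setFeature(feature_external_ges, False)

collector = TextCollector()
parser.setContentHandler(collector)
parser.parse(io.BytesIO(DOC))

text = "".join(collector.chunks)
print(repr(text))
```

With the feature off, the reference is skipped and only the surrounding text reaches the handler.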
   233          
   234          DOM xml.dom.xmlbuilder.Options
   235          ..............................
   236          
   237          external_parameter_entities
   238            ignored
   239          
   240          external_general_entities
   241            ignored
   242          
   243          external_dtd_subset
   244            ignored
   245          
   246          entities
   247            unsure
   248          
   249          
   250          defusedxml
   251          ==========
   252          
   253          The `defusedxml package`_ (`defusedxml on PyPI`_)
   254          contains several Python-only workarounds and fixes
   255          for denial of service and other vulnerabilities in Python's XML libraries.
   256          In order to benefit from the protection you just have to import and use the
   257          listed functions / classes from the right defusedxml module instead of the
   258          original module. Only `defusedxml.xmlrpc`_ is implemented as a monkey patch.
   259          
   260          Instead of::
   261          
   262             >>> from xml.etree.ElementTree import parse
   263             >>> et = parse(xmlfile)
   264          
   265          alter code to::
   266          
   267             >>> from defusedxml.ElementTree import parse
   268             >>> et = parse(xmlfile)
   269          
   270          Additionally the package has an **untested** function to monkey patch
   271          all stdlib modules with ``defusedxml.defuse_stdlib()``.
   272          
   273          All functions and parser classes accept three additional keyword arguments.
   274          They return either the same objects as the original functions or compatible
   275          subclasses.
   276          
   277          forbid_dtd (default: False)
   278            disallow XML with a ``<!DOCTYPE>`` declaration and raise a
   279            *DTDForbidden* exception when a DOCTYPE declaration is found.
   280          
   281          forbid_entities (default: True)
   282            disallow XML with ``<!ENTITY>`` declarations inside the DTD and raise an
   283            *EntitiesForbidden* exception when an entity is declared.
   284          
   285          forbid_external (default: True)
   286            disallow any access to remote or local resources in external entities
   287            or DTD and raise an *ExternalReferenceForbidden* exception when a DTD
   288            or entity references an external resource.
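
The following is a minimal sketch of how checks of this kind can be built on the standard library's expat bindings. It is not taken from defusedxml's source. ``Forbidden``, ``check_xml`` and ``is_rejected`` are hypothetical names, and the single exception type stands in for the three exceptions listed above.

```python
import xml.parsers.expat as expat


class Forbidden(ValueError):
    """Hypothetical exception type used only in this sketch."""


def check_xml(data, forbid_dtd=False, forbid_entities=True, forbid_external=True):
    """Parse `data` (bytes) and raise Forbidden on disallowed constructs."""
    parser = expat.ParserCreate()

    def start_doctype(name, sysid, pubid, has_internal_subset):
        if forbid_dtd:
            raise Forbidden("DTD %r is not allowed" % name)
        if forbid_external and (sysid or pubid):
            raise Forbidden("external DTD for %r is not allowed" % name)

    def entity_decl(name, is_parameter_entity, value, base, sysid, pubid, notation):
        if forbid_entities:
            raise Forbidden("entity %r is not allowed" % name)
        if forbid_external and (sysid or pubid):
            raise Forbidden("external entity %r is not allowed" % name)

    def external_entity_ref(context, base, sysid, pubid):
        if forbid_external:
            raise Forbidden("external reference %r is not allowed" % sysid)
        return 1  # tell expat to carry on without loading anything

    parser.StartDoctypeDeclHandler = start_doctype
    parser.EntityDeclHandler = entity_decl
    parser.ExternalEntityRefHandler = external_entity_ref
    parser.Parse(data, True)


def is_rejected(data, **options):
    """Return True if check_xml() refuses the document."""
    try:
        check_xml(data, **options)
    except Forbidden:
        return True
    return False


PLAIN = b"<root>text</root>"
INTERNAL = b'<!DOCTYPE b [<!ENTITY a "xxx">]><b>&a;</b>'
EXTERNAL = b'<!DOCTYPE b [<!ENTITY e SYSTEM "file:///nonexistent">]><b>&e;</b>'

print(is_rejected(PLAIN), is_rejected(INTERNAL), is_rejected(EXTERNAL))
```

Exceptions raised inside an expat handler propagate out of ``Parse()``, so the document is abandoned as soon as the first forbidden construct is seen.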
   289          
   290          
   291          defusedxml (package)
   292          --------------------
   293          
   294          DefusedXmlException, DTDForbidden, EntitiesForbidden,
   295          ExternalReferenceForbidden, NotSupportedError
   296          
   297          defuse_stdlib() (*experimental*)
   298          
   299          
   300          defusedxml.cElementTree
   301          -----------------------
   302          
   303          parse(), iterparse(), fromstring(), XMLParser
   304          
   305          
   306          defusedxml.ElementTree
   307          -----------------------
   308          
   309          parse(), iterparse(), fromstring(), XMLParser
   310          
   311          
   312          defusedxml.expatreader
   313          ----------------------
   314          
   315          create_parser(), DefusedExpatParser
   316          
   317          
   318          defusedxml.sax
   319          --------------
   320          
   321          parse(), parseString(), create_parser()
   322          
   323          
   324          defusedxml.expatbuilder
   325          -----------------------
   326          
   327          parse(), parseString(), DefusedExpatBuilder, DefusedExpatBuilderNS
   328          
   329          
   330          defusedxml.minidom
   331          ------------------
   332          
   333          parse(), parseString()
   334          
   335          
   336          defusedxml.pulldom
   337          ------------------
   338          
   339          parse(), parseString()
   340          
   341          
   342          defusedxml.xmlrpc
   343          -----------------
   344          
   345          The fix is implemented as a monkey patch for the stdlib's xmlrpc package (3.x)
   346          or xmlrpclib module (2.x). The function `monkey_patch()` enables the fixes,
   347          `unmonkey_patch()` removes the patch and puts the code in its former state.
   348          
   349          The monkey patch protects against XML related attacks as well as
   350          decompression bombs and excessively large requests or responses. The default
   351          setting is 30 MB for requests, responses and gzip decompression. You can
   352          modify the default by changing the module variable `MAX_DATA`. A value of
   353          `-1` disables the limit.
   354          
   355          
   356          defusedxml.lxml
   357          ---------------
   358          
   359          The module acts as an *example* of how you could protect code that uses
   360          lxml.etree. It implements a custom Element class that filters out
   361          Entity instances, a custom parser factory and a thread local storage for
   362          parser instances. It also has a check_docinfo() function which inspects
   363          a tree for internal or external DTDs and entity declarations. In order to
   364          check for entities lxml > 3.0 is required.
   365          
   366          parse(), fromstring()
   367          RestrictedElement, GlobalParserTLS, getDefaultParser(), check_docinfo()
   368          
   369          
   370          defusedexpat
   371          ============
   372          
   373          The `defusedexpat package`_ (`defusedexpat on PyPI`_)
   374          comes with binary extensions and a
   375          `modified expat`_ library instead of the standard `expat parser`_. It's
   376          basically a stand-alone version of the patches for Python's standard
   377          library C extensions.
   378          
   379          Modifications in expat
   380          ----------------------
   381          
   382          new definitions::
   383          
   384            XML_BOMB_PROTECTION
   385            XML_DEFAULT_MAX_ENTITY_INDIRECTIONS
   386            XML_DEFAULT_MAX_ENTITY_EXPANSIONS
   387            XML_DEFAULT_RESET_DTD
   388          
   389          new XML_FeatureEnum members::
   390          
   391            XML_FEATURE_MAX_ENTITY_INDIRECTIONS
   392            XML_FEATURE_MAX_ENTITY_EXPANSIONS
   393            XML_FEATURE_IGNORE_DTD
   394          
   395          new XML_Error members::
   396          
   397            XML_ERROR_ENTITY_INDIRECTIONS
   398            XML_ERROR_ENTITY_EXPANSION
   399          
   400          new API functions::
   401          
   402            int XML_GetFeature(XML_Parser parser,
   403                               enum XML_FeatureEnum feature,
   404                               long *value);
   405            int XML_SetFeature(XML_Parser parser,
   406                               enum XML_FeatureEnum feature,
   407                               long value);
   408            int XML_GetFeatureDefault(enum XML_FeatureEnum feature,
   409                                      long *value);
   410            int XML_SetFeatureDefault(enum XML_FeatureEnum feature,
   411                                      long value);
   412          
   413          XML_FEATURE_MAX_ENTITY_INDIRECTIONS
   414             Limit the number of indirections that are allowed to occur during the
   415             expansion of a nested entity. A counter starts when an entity reference
   416             is encountered. It resets after the entity is fully expanded. The limit
   417             protects the parser against exponential entity expansion attacks (aka
   418             billion laughs attack). When the limit is exceeded the parser stops and
   419             fails with `XML_ERROR_ENTITY_INDIRECTIONS`.
   420             A value of 0 disables the protection.
   421          
   422             Supported range
   423               0 .. UINT_MAX
   424             Default
   425               40
   426          
   427          XML_FEATURE_MAX_ENTITY_EXPANSIONS
   428             Limit the total length of all entity expansions throughout the entire
   429             document. The lengths of all entities are accumulated in a parser variable.
   430             The setting protects against quadratic blowup attacks (lots of expansions
   431             of a large entity declaration). When the sum of all entities exceeds
   432             the limit, the parser stops and fails with `XML_ERROR_ENTITY_EXPANSION`.
   433             A value of 0 disables the protection.
   434          
   435             Supported range
   436               0 .. UINT_MAX
   437             Default
   438               8 MiB
   439          
   440          XML_FEATURE_RESET_DTD
   441             Reset all DTD information after the <!DOCTYPE> block has been parsed. When
   442             the flag is set (default: false) all DTD information is reset after the
   443             endDoctypeDeclHandler has been called. The flag can be set inside the
   444             endDoctypeDeclHandler. Without DTD information any entity reference in
   445             the document body leads to `XML_ERROR_UNDEFINED_ENTITY`.
   446          
   447             Supported range
   448               0, 1
   449             Default
   450               0
   451          
   452          
   453          How to avoid XML vulnerabilities
   454          ================================
   455          
   456          Best practices
   457          --------------
   458          
   459          * Don't allow DTDs
   460          * Don't expand entities
   461          * Don't resolve externals
   462          * Limit parse depth
   463          * Limit total input size
   464          * Limit parse time
   465          * Favor a SAX or iterparse-like parser for potentially large data
   466          * Validate and properly quote arguments to XSL transformations and
   467            XPath queries
   468          * Don't use XPath expressions from untrusted sources
   469          * Don't apply XSL transformations that come from untrusted sources
   470          
   471          (based on Brad Hill's `Attacking XML Security`_)
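
A few of these points (input size, parse depth, an iterparse-style parser) can be sketched with the standard library alone. The function name ``count_elements`` and the default limits are placeholders chosen for this illustration. The sketch does not address DTDs, entities or external references, which still need a hardened parser.

```python
import io
import xml.etree.ElementTree as ET


def count_elements(stream, tag, max_bytes=1_000_000, max_depth=50):
    """Count `tag` elements in a binary stream under size and depth limits.

    The default limits are arbitrary; pick values that fit the data.
    """
    # Read one byte more than allowed so that oversized input is detectable.
    data = stream.read(max_bytes + 1)
    if len(data) > max_bytes:
        raise ValueError("input larger than %d bytes" % max_bytes)

    depth = 0
    count = 0
    for event, elem in ET.iterparse(io.BytesIO(data), events=("start", "end")):
        if event == "start":
            depth += 1
            if depth > max_depth:
                raise ValueError("nesting deeper than %d levels" % max_depth)
        else:
            depth -= 1
            if elem.tag == tag:
                count += 1
            elem.clear()  # release the element's content once it is handled
    return count


sample = io.BytesIO(b"<items><item/><item/><item/></items>")
print(count_elements(sample, "item"))
```

A limit on parse time is not shown; it usually has to be enforced from outside the parser, for example by running the parse in a worker with a timeout.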
   472          
   473          
   474          Other things to consider
   475          ========================
   476          
   477          XML, XML parsers and processing libraries have more features and possible
   478          issues that could lead to DoS vulnerabilities or security exploits in
   479          applications. I have compiled an incomplete list of theoretical issues that
   480          need further research and more attention. The list is deliberately pessimistic
   481          and a bit paranoid, too. It contains things that might go wrong under daffy
   482          circumstances.
   483          
   484          
   485          attribute blowup / hash collision attack
   486          ----------------------------------------
   487          
   488          XML parsers may use an algorithm with quadratic runtime O(n :sup:`2`) to
   489          handle attributes and namespaces. If a parser uses hash tables (dictionaries)
   490          to store attributes and namespaces, the implementation may be vulnerable to
   491          hash collision attacks, thus reducing the performance to O(n :sup:`2`) again.
   492          In either case an attacker is able to mount a denial of service attack with
   493          an XML document that contains thousands upon thousands of attributes in
   494          a single node.
   495          
   496          I haven't researched yet if expat, pyexpat or libxml2 are vulnerable.
   497          
   498          
   499          decompression bomb
   500          ------------------
   501          
   502          The issue of decompression bombs (aka `ZIP bomb`_) applies to all XML libraries
   503          that can parse compressed XML streams such as gzipped HTTP streams or LZMA-ed
   504          files. For an attacker it can reduce the amount of transmitted data by three
   505          orders of magnitude or more. Gzip is able to compress 1 GiB of zeros to
   506          roughly 1 MB, lzma is even better::
   507          
   508              $ dd if=/dev/zero bs=1M count=1024 | gzip > zeros.gz
   509              $ dd if=/dev/zero bs=1M count=1024 | lzma -z > zeros.xy
   510              $ ls -sh zeros.*
   511              1020K zeros.gz
   512               148K zeros.xy
   513          
   514          None of Python's standard XML libraries decompress streams except for
   515          ``xmlrpclib``. The module is vulnerable <http://bugs.python.org/issue16043>
   516          to decompression bombs.
   517          
   518          lxml can load and process compressed data through libxml2 transparently.
   519          libxml2 can handle even very large blobs of compressed data efficiently
   520          without using too much memory. But it doesn't protect applications from
   521          decompression bombs. A carefully written SAX or iterparse-like approach can
   522          be safe.
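
One way to guard against this is to cap the amount of decompressed data before it ever reaches a parser. The sketch below uses only ``zlib`` and ``gzip`` from the standard library; the function name ``bounded_gunzip`` and the limits are assumptions for this example, not part of any library mentioned here.

```python
import gzip
import zlib


def bounded_gunzip(data, limit):
    """Decompress gzip `data`, refusing to produce more than `limit` bytes."""
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
    # Ask for at most one byte more than allowed, so an overflow is detected
    # without holding the full output in memory.
    out = decomp.decompress(data, limit + 1)
    if len(out) > limit:
        raise ValueError("decompressed data exceeds %d bytes" % limit)
    return out


payload = gzip.compress(b"\0" * (1024 * 1024))  # 1 MiB of zeros
print(len(payload), "compressed bytes")

try:
    bounded_gunzip(payload, 64 * 1024)
    refused = False
except ValueError:
    refused = True
print("refused:", refused)
```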
   523          
   524          
   525          Processing Instruction
   526          ----------------------
   527          
   528          `PI`_'s like::
   529          
   530            <?xml-stylesheet type="text/xsl" href="style.xsl"?>
   531          
   532          may pose further threats for XML processing. It depends on whether and how a
   533          processor handles processing instructions. The issue of URL retrieval with
   534          network or local file access applies to processing instructions, too.
   535          
   536          
   537          Other DTD features
   538          ------------------
   539          
   540          `DTD`_ has more features like ``<!NOTATION>``. I haven't researched how
   541          these features may be a security threat.
   542          
   543          
   544          XPath
   545          -----
   546          
   547          XPath statements may introduce DoS vulnerabilities. Code should never execute
   548          queries from untrusted sources. An attacker may also be able to create an XML
   549          document that makes certain XPath queries costly or resource hungry.
   550          
   551          
   552          XPath injection attacks
   553          -----------------------
   554          
   555          XPath injection attacks pretty much work like SQL injection attacks.
   556          Arguments to XPath queries must be quoted and validated properly, especially
   557          when they are taken from the user. The page `Avoid the dangers of XPath injection`_
   558          lists some ramifications of XPath injections.
   559          
   560          Python's standard library doesn't have XPath support. lxml supports
   561          parameterized XPath queries, which do proper quoting. You just have to use
   562          its xpath() method correctly::
   563          
   564             # DON'T
   565             >>> tree.xpath("/tag[@id='%s']" % value)
   566          
   567             # instead do
   568             >>> tree.xpath("/tag[@id=$tagid]", tagid=value)
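
To see why the first form is dangerous, it is enough to look at the query string it produces. This sketch only builds strings and needs no XPath engine; ``naive_query`` is a made-up helper that mirrors the DON'T case.

```python
def naive_query(value):
    """Build an XPath query by string interpolation (the DON'T case)."""
    return "/tag[@id='%s']" % value


# An ordinary value gives the intended query.
print(naive_query("item-42"))

# A value containing quotes changes the structure of the query: the
# predicate is now true for every <tag> element.
print(naive_query("' or '1'='1"))
```

With a parameterized query the value is passed as data and cannot alter the expression itself.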
   569          
   570          
   571          XInclude
   572          --------
   573          
   574          `XML Inclusion`_ is another way to load and include external files::
   575          
   576             <root xmlns:xi="http://www.w3.org/2001/XInclude">
   577               <xi:include href="filename.txt" parse="text" />
   578             </root>
   579          
   580          This feature should be disabled when XML files from an untrusted source are
   581          processed. Some Python XML libraries and libxml2 support XInclude but don't
   582          have an option to sandbox inclusion and limit it to allowed directories.
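
Where inclusion cannot be avoided, ``xml.etree.ElementInclude`` accepts a custom loader, and a loader can enforce its own directory restriction. The sketch below is one possible approach under that assumption; ``make_sandboxed_loader`` and the file names are hypothetical, and the temporary directory exists only to keep the example self-contained.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude


def make_sandboxed_loader(base_dir):
    """Return an ElementInclude loader that only reads files below base_dir."""
    base = os.path.realpath(base_dir)

    def loader(href, parse, encoding=None):
        target = os.path.realpath(os.path.join(base, href))
        if os.path.commonpath([base, target]) != base:
            raise ValueError("inclusion of %r is not allowed" % href)
        if parse == "xml":
            # Note: the included file is still read by the stdlib parser.
            return ET.parse(target).getroot()
        with open(target, encoding=encoding or "utf-8") as handle:
            return handle.read()

    return loader


DOC = """<root xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="%s" parse="text" />
</root>"""

with tempfile.TemporaryDirectory() as tmp:
    allowed = os.path.join(tmp, "allowed")
    os.mkdir(allowed)
    with open(os.path.join(allowed, "filename.txt"), "w", encoding="utf-8") as f:
        f.write("included text")
    with open(os.path.join(tmp, "secret.txt"), "w", encoding="utf-8") as f:
        f.write("secret")

    loader = make_sandboxed_loader(allowed)

    # A file inside the allowed directory is included as usual.
    root = ET.fromstring(DOC % "filename.txt")
    ElementInclude.include(root, loader=loader)
    included = "".join(root.itertext()).strip()

    # A path that climbs out of the allowed directory is refused.
    try:
        ElementInclude.include(ET.fromstring(DOC % "../secret.txt"), loader=loader)
        escaped = True
    except ValueError:
        escaped = False

print(included, escaped)
```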
   583          
   584          
   585          XMLSchema location
   586          ------------------
   587          
   588          A validating XML parser may download schema files from the information in a
   589          ``xsi:schemaLocation`` attribute.
   590          
   591          ::
   592          
   593            <ead xmlns="urn:isbn:1-931666-22-9"
   594                 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   595                 xsi:schemaLocation="urn:isbn:1-931666-22-9 http://www.loc.gov/ead/ead.xsd">
   596            </ead>
   597          
   598          
   599          XSL Transformation
   600          ------------------
   601          
   602          You should keep in mind that XSLT is a Turing complete language. Never
   603          process XSLT code from unknown or untrusted sources! XSLT processors may
   604          allow you to interact with external resources in ways you can't even imagine.
   605          Some processors even support extensions that allow read/write access to the
   606          file system, access to JRE objects or scripting with Jython.
   607          
   608          Example from `Attacking XML Security`_ for Xalan-J::
   609          
   610              <xsl:stylesheet version="1.0"
   611               xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   612               xmlns:rt="http://xml.apache.org/xalan/java/java.lang.Runtime"
   613               xmlns:ob="http://xml.apache.org/xalan/java/java.lang.Object"
   614               exclude-result-prefixes= "rt ob">
   615               <xsl:template match="/">
   616                 <xsl:variable name="runtimeObject" select="rt:getRuntime()"/>
   617                 <xsl:variable name="command"
   618                   select="rt:exec($runtimeObject, &apos;c:\Windows\system32\cmd.exe&apos;)"/>
   619                 <xsl:variable name="commandAsString" select="ob:toString($command)"/>
   620                 <xsl:value-of select="$commandAsString"/>
   621               </xsl:template>
   622              </xsl:stylesheet>
   623          
   624          
   625          Related CVEs
   626          ============
   627          
   628          CVE-2013-1664
   629            Unrestricted entity expansion induces DoS vulnerabilities in Python XML
   630            libraries (XML bomb)
   631          
   632          CVE-2013-1665
   633            External entity expansion in Python XML libraries inflicts potential
   634            security flaws and DoS vulnerabilities
   635          
   636          
   637          Other languages / frameworks
   638          =============================
   639          
   640          Several other programming languages and frameworks are vulnerable as well. A
   641          couple of them are affected by the fact that libxml2 up to 2.9.0 has no
   642          protection against quadratic blowup attacks. Most of them have potentially
   643          dangerous default settings for entity expansion and external entities, too.
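
To give a feeling for the numbers behind a quadratic blowup, here is a small
Python sketch. It builds such a document as a plain string and computes how
much text a parser would produce if it expanded every entity reference. The
function names and the sizes (1000 characters, 1000 references) are chosen for
this illustration only. The payload is never parsed.

```python
def quadratic_blowup_payload(entity_len, references):
    """Build an XML document with one long entity that is referenced many times."""
    entity = "a" * entity_len
    body = "&a;" * references
    return '<!DOCTYPE bomb [<!ENTITY a "%s">]><bomb>%s</bomb>' % (entity, body)


def expanded_size(entity_len, references):
    """Characters of text produced if every reference is expanded."""
    return entity_len * references


payload = quadratic_blowup_payload(1000, 1000)
ratio = expanded_size(1000, 1000) / len(payload)
print("document: %d characters" % len(payload))
print("expanded: %d characters" % expanded_size(1000, 1000))
print("amplification: about %dx" % ratio)
```

A document of about 4 kB expands to one million characters here. Raising both
the entity length and the number of references by a factor of ten makes the
document about ten times larger and the expanded text a hundred times larger.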
   644          
   645          Perl
   646          ----
   647          
   648          Perl's XML::Simple is vulnerable to quadratic entity expansion and external
   649          entity expansion (both local and remote).
   650          
   651          
   652          Ruby
   653          ----
   654          
   655          Ruby's REXML document parser is vulnerable to entity expansion attacks
   656          (both quadratic and exponential) but it doesn't do external entity
   657          expansion by default. In order to counteract entity expansion you have to
   658          disable the feature::
   659          
   660            REXML::Document.entity_expansion_limit = 0
   661          
   662          libxml-ruby and hpricot don't expand entities in their default configuration.
   663          
   664          
   665          PHP
   666          ---
   667          
   668          PHP's SimpleXML API is vulnerable to quadratic entity expansion and loads
   669          entities from local and remote resources. The option ``LIBXML_NONET`` disables
   670          network access but still allows local file access. ``LIBXML_NOENT`` seems to
   671          have no effect on entity expansion in PHP 5.4.6.
   672          
   673          
   674          C# / .NET / Mono
   675          ----------------
   676          
   677          Information in `XML DoS and Defenses (MSDN)`_ suggests that .NET is
   678          vulnerable with its default settings. The article contains code snippets
   679          that show how to create a secure XML reader::
   680          
   681            XmlReaderSettings settings = new XmlReaderSettings();
   682            settings.ProhibitDtd = false;
   683            settings.MaxCharactersFromEntities = 1024;
   684            settings.XmlResolver = null;
   685            XmlReader reader = XmlReader.Create(stream, settings);
   686          
   687          
   688          Java
   689          ----
   690          
   691          Untested. The documentation of Xerces and its `Xerces SecurityMananger`_
   692          suggests that Xerces is also vulnerable to billion laughs attacks with its
   693          default settings. It also resolves entities when an
   694          ``org.xml.sax.EntityResolver`` is configured. I'm not yet sure about the
   695          default setting here.
   696          
   697          Java specialists suggest using a custom builder factory::
   698          
   699            DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
   700            builderFactory.setXIncludeAware(false);
   701            builderFactory.setExpandEntityReferences(false);
   702            builderFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
   703            // either forbid DOCTYPE declarations entirely
   704            builderFactory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
   705            // or, if you need DTDs, disable external entities and external DTD loading
   706            builderFactory.setFeature("http://xml.org/sax/features/external-general-entities", false);
   707            builderFactory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
   708            builderFactory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
   709            builderFactory.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
   710          
   711          
   712          TODO
   713          ====
   714          
   715          * DOM: Use xml.dom.xmlbuilder options for entity handling
   716          * SAX: take feature_external_ges and feature_external_pes (?) into account
   717          * test experimental monkey patching of stdlib modules
   718          * improve documentation
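
For reference, the two SAX features named in the list above are exposed by the
stdlib as ``xml.sax.handler.feature_external_ges`` and
``xml.sax.handler.feature_external_pes``. The sketch below shows only how they
are switched off on a plain stdlib parser. How defusedxml will eventually take
them into account is an open item, so nothing here describes its behaviour.

```python
import xml.sax
from xml.sax.handler import feature_external_ges, feature_external_pes

# Plain stdlib SAX parser (expat based in CPython).
parser = xml.sax.make_parser()

# Ask the parser not to load external general entities ...
parser.setFeature(feature_external_ges, False)
# ... and not to load external parameter entities.
parser.setFeature(feature_external_pes, False)

# Read the settings back to confirm what the parser reports.
ges_enabled = bool(parser.getFeature(feature_external_ges))
pes_enabled = bool(parser.getFeature(feature_external_pes))
print("external general entities:", ges_enabled)
print("external parameter entities:", pes_enabled)
```

Turning these features off addresses external entities only. It does not limit
the expansion of entities that are defined inside the document itself.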
   719          
   720          
   721          License
   722          =======
   723          
   724          Copyright (c) 2013 by Christian Heimes <christian@python.org>
   725          
   726          Licensed to PSF under a Contributor Agreement.
   727          
   728          See http://www.python.org/psf/license for licensing details.
   729          
   730          
   731          Acknowledgements
   732          ================
   733          
   734          Brett Cannon (Python Core developer)
   735            review and code cleanup
   736          
   737          Antoine Pitrou (Python Core developer)
   738            code review
   739          
   740          Aaron Patterson, Ben Murphy and Michael Koziarski (Ruby community)
   741            Many thanks to Aaron, Ben and Michael from the Ruby community for their
   742            report and assistance.
   743          
   744          Thierry Carrez (OpenStack)
   745            Many thanks to Thierry for his report to the Python Security Response
   746            Team on behalf of the OpenStack security team.
   747          
   748          Carl Meyer (Django)
   749            Many thanks to Carl for his report to PSRT on behalf of the Django security
   750            team.
   751          
   752          Daniel Veillard (libxml2)
   753            Many thanks to Daniel for his insight and assistance with libxml2.
   754          
   755          semantics GmbH (http://www.semantics.de/)
   756            Many thanks to my employer semantics for letting me work on the issue
   757            during working hours as part of semantics's open source initiative.
   758          
   759          
   760          References
   761          ==========
   762          
   763          * `XML DoS and Defenses (MSDN)`_
   764          * `Billion Laughs`_ on Wikipedia
   765          * `ZIP bomb`_ on Wikipedia
   766          * `Configure SAX parsers for secure processing`_
   767          * `Testing for XML Injection`_
   768          
   769          .. _defusedxml package: https://bitbucket.org/tiran/defusedxml
   770          .. _defusedxml on PyPI: https://pypi.python.org/pypi/defusedxml
   771          .. _defusedexpat package: https://bitbucket.org/tiran/defusedexpat
   772          .. _defusedexpat on PyPI: https://pypi.python.org/pypi/defusedexpat
   773          .. _modified expat: https://bitbucket.org/tiran/expat
   774          .. _expat parser: http://expat.sourceforge.net/
   775          .. _Attacking XML Security: https://www.isecpartners.com/media/12976/iSEC-HILL-Attacking-XML-Security-bh07.pdf
   776          .. _Billion Laughs: http://en.wikipedia.org/wiki/Billion_laughs
   777          .. _XML DoS and Defenses (MSDN): http://msdn.microsoft.com/en-us/magazine/ee335713.aspx
   778          .. _ZIP bomb: http://en.wikipedia.org/wiki/Zip_bomb
   779          .. _DTD: http://en.wikipedia.org/wiki/Document_Type_Definition
   780          .. _PI: https://en.wikipedia.org/wiki/Processing_Instruction
   781          .. _Avoid the dangers of XPath injection: http://www.ibm.com/developerworks/xml/library/x-xpathinjection/index.html
   782          .. _Configure SAX parsers for secure processing: http://www.ibm.com/developerworks/xml/library/x-tipcfsx/index.html
   783          .. _Testing for XML Injection: https://www.owasp.org/index.php/Testing_for_XML_Injection_(OWASP-DV-008)
   784          .. _Xerces SecurityMananger: http://xerces.apache.org/xerces2-j/javadocs/xerces2/org/apache/xerces/util/SecurityManager.html
   785          .. _XML Inclusion: http://www.w3.org/TR/xinclude/#include_element
   786          
   787          Changelog
   788          =========
   789          
   790          defusedxml 0.4.1
   791          ----------------
   792          
   793          *Release date: 28-Mar-2013*
   794          
   795          - Add more demo exploits, e.g. python_external.py and Xalan XSLT demos.
   796          - Improved documentation.
   797          
   798          
   799          defusedxml 0.4
   800          --------------
   801          
   802          *Release date: 25-Feb-2013*
   803          
   804          - As per http://seclists.org/oss-sec/2013/q1/340 please REJECT
   805            CVE-2013-0278, CVE-2013-0279 and CVE-2013-0280 and use CVE-2013-1664,
   806            CVE-2013-1665 for OpenStack/etc.
   807          - Add missing parser_list argument to sax.make_parser(). The argument is
   808            ignored, though. (thanks to Florian Apolloner)
   809          - Add demo exploit for external entity attack on Python's SAX parser, XML-RPC
   810            and WebDAV.
   811          
   812          
   813          defusedxml 0.3
   814          --------------
   815          
   816          *Release date: 19-Feb-2013*
   817          
   818          - Improve documentation
   819          
   820          
   821          defusedxml 0.2
   822          --------------
   823          
   824          *Release date: 15-Feb-2013*
   825          
   826          - Rename ExternalEntitiesForbidden to ExternalReferenceForbidden
   827          - Rename defusedxml.lxml.check_dtd() to check_docinfo()
   828          - Unify argument names in callbacks
   829          - Add arguments and formatted representation to exceptions
   830          - Add forbid_external argument to all functions and classes
   831          - More tests
   832          - LOTS of documentation
   833          - Add example code for other languages (Ruby, Perl, PHP) and parsers (Genshi)
   834          - Add protection against XML and gzip attacks to xmlrpclib
   835          
   836          defusedxml 0.1
   837          --------------
   838          
   839          *Release date: 08-Feb-2013*
   840          
   841          - Initial and internal release for PSRT review
   842          
   843  Keywords: xml bomb DoS
   844  Platform: all
   845  Classifier: Development Status :: 5 - Production/Stable
   846  Classifier: Intended Audience :: Developers
   847  Classifier: License :: OSI Approved :: Python Software Foundation License
   848  Classifier: Natural Language :: English
   849  Classifier: Programming Language :: Python
   850  Classifier: Programming Language :: Python :: 2
   851  Classifier: Programming Language :: Python :: 2.6
   852  Classifier: Programming Language :: Python :: 2.7
   853  Classifier: Programming Language :: Python :: 3
   854  Classifier: Programming Language :: Python :: 3.1
   855  Classifier: Programming Language :: Python :: 3.2
   856  Classifier: Programming Language :: Python :: 3.3
   857  Classifier: Topic :: Text Processing :: Markup :: XML