# SPINN with TensorFlow eager execution

SPINN, or Stack-Augmented Parser-Interpreter Neural Network, is a recursive
neural network that utilizes syntactic parse information for natural language
understanding.

SPINN was originally described by:
Bowman, S.R., Gauthier, J., Rastogi, A., Gupta, R., Manning, C.D., & Potts, C.
  (2016). A Fast Unified Model for Parsing and Sentence Understanding.
  https://arxiv.org/abs/1603.06021

Our implementation is based on @jekbradbury's PyTorch implementation at
https://github.com/jekbradbury/examples/blob/spinn/snli/spinn.py,
which was released under the BSD 3-Clause License:
https://github.com/jekbradbury/examples/blob/spinn/LICENSE

Other eager execution examples can be found under
[tensorflow/contrib/eager/python/examples](../../../../tensorflow/contrib/eager/python/examples).

## Contents

- [`data.py`](../../../../tensorflow/contrib/eager/python/examples/spinn/data.py): Pipeline for loading and preprocessing the
  [SNLI](https://nlp.stanford.edu/projects/snli/) data and
  [GloVe](https://nlp.stanford.edu/projects/glove/) word embeddings, written
  using the [`tf.data`](https://www.tensorflow.org/guide/datasets) API.
- [`spinn.py`](./spinn.py): Model definition and training routines.
  This example illustrates how one might perform the following actions with
  eager execution enabled:
  * defining a model consisting of a dynamic computation graph,
  * assigning operations to the CPU or GPU depending on device availability,
  * training the model using the data from the `tf.data`-based pipeline,
  * obtaining metrics such as mean accuracy during training,
  * saving and loading checkpoints,
  * writing summaries for monitoring and visualization in TensorBoard.

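The "dynamic computation graph" in the list above refers to SPINN's shift-reduce loop: a sentence representation is built by pushing word representations onto a stack and repeatedly combining the top two entries, so the shape of the computation depends on each sentence's parse. The following is a minimal pure-Python sketch of that control flow only; `run_spinn_stack` is a hypothetical helper (not part of `spinn.py`), and the toy combiner stands in for the model's learned composition function.

```python
# Sketch of SPINN's stack-based shift-reduce loop (illustrative only,
# not the actual spinn.py implementation).

def run_spinn_stack(tokens, transitions, embed, reduce_fn):
    """Consume 'shift' and 'reduce' ops over a stack of representations."""
    buffer = [embed(t) for t in tokens]  # leaf representations, left to right
    stack = []
    for op in transitions:
        if op == "shift":
            stack.append(buffer.pop(0))  # push the next leaf
        else:  # "reduce": combine the top two stack entries
            right = stack.pop()
            left = stack.pop()
            stack.append(reduce_fn(left, right))
    assert len(stack) == 1, "a well-formed binary parse leaves one root"
    return stack[0]

# Toy usage: "embeddings" are word lengths, the combiner is addition.
root = run_spinn_stack(
    tokens=["The", "dog", "runs"],
    transitions=["shift", "shift", "reduce", "shift", "reduce"],
    embed=len,
    reduce_fn=lambda l, r: l + r,
)
print(root)  # 3 + 3 + 4 = 10
```

Because each sentence supplies its own transition sequence, the loop executes a different sequence of operations per example, which is what makes eager execution a natural fit here.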
## To run

- Make sure you have installed TensorFlow release 1.5 or higher. Alternatively,
  you can use the latest `tf-nightly` or `tf-nightly-gpu` pip
  package to access the eager execution feature.

- Download and extract the raw SNLI data and GloVe embedding vectors.
  For example:

  ```bash
  curl -fSsL https://nlp.stanford.edu/projects/snli/snli_1.0.zip --create-dirs -o /tmp/spinn-data/snli/snli_1.0.zip
  unzip -d /tmp/spinn-data/snli /tmp/spinn-data/snli/snli_1.0.zip
  curl -fSsL http://nlp.stanford.edu/data/glove.42B.300d.zip --create-dirs -o /tmp/spinn-data/glove/glove.42B.300d.zip
  unzip -d /tmp/spinn-data/glove /tmp/spinn-data/glove/glove.42B.300d.zip
  ```

- Train the model. For example:

  ```bash
  python spinn.py --data_root /tmp/spinn-data --logdir /tmp/spinn-logs
  ```

  During training, model checkpoints and TensorBoard summaries will be written
  periodically to the directory specified with the `--logdir` flag.
  The training script will reload a saved checkpoint from that directory if it
  can find one there.

  To view the summaries with TensorBoard:

  ```bash
  tensorboard --logdir /tmp/spinn-logs
  ```

- After training, you may use the model to perform inference on input data in
  the SNLI data format. The premise and hypothesis sentences are specified with
  the command-line flags `--inference_premise` and `--inference_hypothesis`,
  respectively. Each sentence should include the words, as well as parentheses
  representing a binary parse of the sentence. The words and parentheses
  should all be separated by spaces. For instance,

  ```bash
  python spinn.py --data_root /tmp/spinn-data --logdir /tmp/spinn-logs \
      --inference_premise '( ( The dog ) ( ( is running ) . ) )' \
      --inference_hypothesis '( ( The dog ) ( moves . ) )'
  ```

  which will generate an output like the following, due to the semantic
  consistency of the two sentences.

  ```none
  Inference logits:
    entailment:     1.101249 (winner)
    contradiction:  -2.374171
    neutral:        -0.296733
  ```

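  To interpret such an output: the predicted label is simply the argmax over the three logits, and a softmax converts them into probabilities. A minimal pure-Python sketch, using the logits printed above (the `predict` helper is hypothetical, not part of `spinn.py`):

  ```python
  import math

  def predict(logits):
      """Return the winning SNLI label and softmax probabilities."""
      # Subtract the max logit for numerical stability before exponentiating.
      m = max(logits.values())
      exps = {label: math.exp(v - m) for label, v in logits.items()}
      total = sum(exps.values())
      probs = {label: e / total for label, e in exps.items()}
      winner = max(probs, key=probs.get)
      return winner, probs

  winner, probs = predict(
      {"entailment": 1.101249, "contradiction": -2.374171, "neutral": -0.296733}
  )
  print(winner)  # entailment
  ```
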
  By contrast, the following sentence pair:

  ```bash
  python spinn.py --data_root /tmp/spinn-data --logdir /tmp/spinn-logs \
      --inference_premise '( ( The dog ) ( ( is running ) . ) )' \
      --inference_hypothesis '( ( The dog ) ( rests . ) )'
  ```

  will give you an output like the following, due to the semantic
  contradiction of the two sentences.

  ```none
  Inference logits:
    entailment:     -1.070098
    contradiction:  2.798695 (winner)
    neutral:        -1.402287
  ```
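
The space-delimited binary parse format used by the `--inference_premise` and `--inference_hypothesis` flags maps directly to shift-reduce transitions: each word is a shift and each closing parenthesis is a reduce. The following is an illustrative sketch of that mapping; `parse_to_transitions` is a hypothetical helper, and the bundled `data.py` performs its own preprocessing.

```python
def parse_to_transitions(parse):
    """Convert a binary parse like '( ( The dog ) ( moves . ) )' into
    (tokens, transitions): words become 'shift', ')' becomes 'reduce'."""
    tokens, transitions = [], []
    for item in parse.split():
        if item == "(":
            continue  # opening parentheses carry no transition
        elif item == ")":
            transitions.append("reduce")
        else:
            tokens.append(item)
            transitions.append("shift")
    return tokens, transitions

tokens, transitions = parse_to_transitions("( ( The dog ) ( moves . ) )")
print(tokens)       # ['The', 'dog', 'moves', '.']
print(transitions)  # ['shift', 'shift', 'reduce', 'shift', 'shift', 'reduce', 'reduce']
```

A well-formed binary parse over N words yields N shifts and N - 1 reduces, leaving exactly one root representation on the stack.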