github.com/alwaysproblem/mlserving-tutorial@v0.0.0-20221124033215-121cfddbfbf4/TFserving/ClientAPI/python/README.md

# Python API

## **Go through the README.md in the parent directory first**

- enter the python directory

  ```bash
  # assume that you are in the root of this repo
  $ cd python
  ```

- set up the environment

  ```bash
  $ pip install -U pip
  $ pip install numpy tensorflow tensorflow-serving-api grpcio
  ```

## gRPC API

- request a different model name

  ```bash
  $ python3 grpc_request.py -m Toy # python3 grpc_request.py -m <model name>
  # outputs {
  #   key: "output_1"
  #   value {
  #     dtype: DT_FLOAT
  #     tensor_shape {
  #       dim {
  #         size: 2
  #       }
  #       dim {
  #         size: 1
  #       }
  #     }
  #     float_val: 0.9990350008010864
  #     float_val: 0.9997349381446838
  #   }
  # }
  # model_spec {
  #   name: "Toy"
  #   version {
  #     value: 2
  #   }
  #   signature_name: "serving_default"
  # }
  $ python3 grpc_request.py -m Toy_double
  # outputs {
  #   key: "output_1"
  #   value {
  #     dtype: DT_FLOAT
  #     tensor_shape {
  #       dim {
  #         size: 2
  #       }
  #       dim {
  #         size: 1
  #       }
  #     }
  #     float_val: 6.803016662597656
  #     float_val: 8.262093544006348
  #   }
  # }
  # model_spec {
  #   name: "Toy_double"
  #   version {
  #     value: 1
  #   }
  #   signature_name: "serving_default"
  # }
  ```

- request a different version through the version number

  ```bash
  $ python3 grpc_request.py -v 1 # python3 grpc_request.py -v <version number>
  # outputs {
  #   key: "output_1"
  #   value {
  #     dtype: DT_FLOAT
  #     tensor_shape {
  #       dim {
  #         size: 2
  #       }
  #       dim {
  #         size: 1
  #       }
  #     }
  #     float_val: 10.805429458618164
  #     float_val: 14.010123252868652
  #   }
  # }
  # model_spec {
  #   name: "Toy"
  #   version {
  #     value: 1
  #   }
  #   signature_name: "serving_default"
  # }
  $ python3 grpc_request.py -v 2
  # outputs {
  #   key: "output_1"
  #   value {
  #     dtype: DT_FLOAT
  #     tensor_shape {
  #       dim {
  #         size: 2
  #       }
  #       dim {
  #         size: 1
  #       }
  #     }
  #     float_val: 0.9990350008010864
  #     float_val: 0.9997349381446838
  #   }
  # }
  # model_spec {
  #   name: "Toy"
  #   version {
  #     value: 2
  #   }
  #   signature_name: "serving_default"
  # }
  ```

- request a different version through a version label (annotation)

  ```bash
  $ python3 grpc_request.py -l stable # python3 grpc_request.py -l <model label or annotation>
  # outputs {
  #   key: "output_1"
  #   value {
  #     dtype: DT_FLOAT
  #     tensor_shape {
  #       dim {
  #         size: 2
  #       }
  #       dim {
  #         size: 1
  #       }
  #     }
  #     float_val: 10.805429458618164
  #     float_val: 14.010123252868652
  #   }
  # }
  # model_spec {
  #   name: "Toy"
  #   version {
  #     value: 1
  #   }
  #   signature_name: "serving_default"
  # }
  $ python3 grpc_request.py -l canary
  # outputs {
  #   key: "output_1"
  #   value {
  #     dtype: DT_FLOAT
  #     tensor_shape {
  #       dim {
  #         size: 2
  #       }
  #       dim {
  #         size: 1
  #       }
  #     }
  #     float_val: 0.9990350008010864
  #     float_val: 0.9997349381446838
  #   }
  # }
  # model_spec {
  #   name: "Toy"
  #   version {
  #     value: 2
  #   }
  #   signature_name: "serving_default"
  # }
  ```

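The same version and label selection is also available over the REST API through the URL path. Below is a minimal sketch of building those paths with the standard library; the helper name and the default REST port 8501 are assumptions, while the path scheme follows the TensorFlow Serving RESTful API:

```python
def predict_url(model, version=None, label=None, host="localhost", port=8501):
    """Build a TensorFlow Serving REST predict URL, optionally pinned
    to a specific version number or to a version label."""
    url = f"http://{host}:{port}/v1/models/{model}"
    if version is not None:
        url += f"/versions/{version}"
    elif label is not None:
        url += f"/labels/{label}"
    return url + ":predict"

print(predict_url("Toy"))                  # http://localhost:8501/v1/models/Toy:predict
print(predict_url("Toy", version=1))       # http://localhost:8501/v1/models/Toy/versions/1:predict
print(predict_url("Toy", label="stable"))  # http://localhost:8501/v1/models/Toy/labels/stable:predict
```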
- request a multi-task model

- request the model status

  ```bash
  $ python grpc_model_status.py -m Toy_double -v 1
  # model_version_status {
  #   version: 1
  #   state: AVAILABLE
  #   status {
  #   }
  # }
  ```

- request the model metadata

  ```bash
  $ python grpc_metadata.py
  # model_spec {
  #   name: "Toy"
  #   version {
  #     value: 2
  #   }
  # }
  # metadata {
  #   key: "signature_def"
  #   value {
  #     type_url: "type.googleapis.com/tensorflow.serving.SignatureDefMap"
  #     value: "\n\253\001\n\017serving_default\022\227\001\n;\n\007input_1\0220\n\031serving_default_input_1:0\020\001\032\021\022\013\010\377\377\377\377\377\377\377\377\377\001\022\002\010\002\022<\n\010output_1\0220\n\031StatefulPartitionedCall:0\020\001\032\021\022\013\010\377\377\377\377\377\377\377\377\377\001\022\002\010\001\032\032tensorflow/serving/predict\n>\n\025__saved_model_init_op\022%\022#\n\025__saved_model_init_op\022\n\n\004NoOp\032\002\030\001"
  #   }
  # }
  ```

- reload a model through the gRPC API

  ```bash
  $ python grpc_reload_model.py -m Toy
  # model Toy reloaded successfully
  ```

- request the model log

  ```bash
  $ python grpc_request_log.py -m Toy
  # ********************************************request logs********************************************
  # predict_log {
  #   request {
  #     model_spec {
  #       name: "Toy"
  #       signature_name: "serving_default"
  #     }
  #     inputs {
  #       key: "input_1"
  #       value {
  #         dtype: DT_FLOAT
  #         tensor_shape {
  #           dim {
  #             size: 2
  #           }
  #           dim {
  #             size: 2
  #           }
  #         }
  #         tensor_content: "\000\000\200?\000\000\000@\000\000\200?\000\000@@"
  #       }
  #     }
  #     output_filter: "output_1"
  #   }
  # }
  # ************************************************end*************************************************
  # **********************************************outputs***********************************************
  # outputs {
  #   key: "output_1"
  #   value {
  #     dtype: DT_FLOAT
  #     tensor_shape {
  #       dim {
  #         size: 2
  #       }
  #       dim {
  #         size: 1
  #       }
  #     }
  #     float_val: 0.9990350008010864
  #     float_val: 0.9997349381446838
  #   }
  # }
  # model_spec {
  #   name: "Toy"
  #   version {
  #     value: 2
  #   }
  #   signature_name: "serving_default"
  # }
  # ************************************************end*************************************************
  ```

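The `tensor_content` field in the logged request above holds the raw little-endian float32 bytes of the 2x2 input tensor. As a sanity check, those bytes can be decoded with the standard library alone:

```python
import struct

# Raw bytes from the logged request above (dtype DT_FLOAT, shape 2x2).
tensor_content = b"\x00\x00\x80?\x00\x00\x00@\x00\x00\x80?\x00\x00@@"

# DT_FLOAT is a 32-bit float and tensor_content is little-endian,
# so "<{n}f" unpacks the buffer four bytes at a time.
count = len(tensor_content) // 4
values = struct.unpack(f"<{count}f", tensor_content)

# Reshape the flat values back into the 2x2 input batch.
rows = [list(values[i : i + 2]) for i in range(0, count, 2)]
print(rows)  # [[1.0, 2.0], [1.0, 3.0]]
```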
- gRPC API for Python
  - [gRPC API definitions](https://github.com/tensorflow/serving/tree/master/tensorflow_serving/apis)

  - predict.proto

    ```protobuf
    syntax = "proto3";

    package tensorflow.serving;
    option cc_enable_arenas = true;

    import "tensorflow/core/framework/tensor.proto";
    import "tensorflow_serving/apis/model.proto";

    // PredictRequest specifies which TensorFlow model to run, as well as
    // how inputs are mapped to tensors and how outputs are filtered before
    // returning to user.
    message PredictRequest {
      // Model Specification. If version is not specified, will use the latest
      // (numerical) version.
      ModelSpec model_spec = 1;  // in Python: `request.model_spec`

      // Input tensors.
      // Names of input tensor are alias names. The mapping from aliases to real
      // input tensor names is stored in the SavedModel export as a prediction
      // SignatureDef under the 'inputs' field.
      map<string, TensorProto> inputs = 2;  // in Python: the `request.inputs` map

      // Output filter.
      // Names specified are alias names. The mapping from aliases to real output
      // tensor names is stored in the SavedModel export as a prediction
      // SignatureDef under the 'outputs' field.
      // Only tensors specified here will be run/fetched and returned, with the
      // exception that when none is specified, all tensors specified in the
      // named signature will be run/fetched and returned.
      repeated string output_filter = 3;  // in Python: `request.output_filter`, a repeated field to append values to
    }

    // Response for PredictRequest on successful run.
    message PredictResponse {
      // Effective Model Specification used to process PredictRequest.
      ModelSpec model_spec = 2;

      // Output tensors.
      map<string, TensorProto> outputs = 1;
    }
    ```

  - model.proto

    ```protobuf
    syntax = "proto3";

    package tensorflow.serving;
    option cc_enable_arenas = true;

    import "google/protobuf/wrappers.proto";

    // Metadata for an inference request such as the model name and version.
    message ModelSpec {
      // Required servable name.
      string name = 1;

      // Optional choice of which version of the model to use.
      //
      // Recommended to be left unset in the common case. Should be specified only
      // when there is a strong version consistency requirement.
      //
      // When left unspecified, the system will serve the best available version.
      // This is typically the latest version, though during version transitions,
      // notably when serving on a fleet of instances, may be either the previous or
      // new version.

      // in Python: `request.model_spec.version.value`
      // or `request.model_spec.version_label`
      oneof version_choice {
        // Use this specific version number.
        google.protobuf.Int64Value version = 2;

        // Use the version associated with the given label.
        string version_label = 4;
      }

      // A named signature to evaluate. If unspecified, the default signature will
      // be used.
      // in Python: `request.model_spec.signature_name`
      string signature_name = 3;
    }
    ```

- [RESTful API](https://www.tensorflow.org/tfx/serving/api_rest)

- run post_request.py

  ```bash
  python3 post_request.py
  # this request is based on instances
  # True
  # {
  #     "predictions": [[0.990161777], [0.99347043]
  #     ]
  # }
  # time consumption: 47.346710999999985ms
  # this request is based on inputs
  # True
  # {
  #     "outputs": [
  #         [
  #             0.985201657
  #         ],
  #         [
  #             0.99923408
  #         ]
  #     ]
  # }
  # time consumption: 6.932738000000049ms
  ```

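The two runs above correspond to the two REST payload formats: `instances` (row format, one entry per example) and `inputs` (columnar format, tensors keyed by alias). A minimal sketch of building both request bodies with the standard library; both would be POSTed to the model's `:predict` endpoint on the default REST port 8501:

```python
import json

batch = [[1.0, 2.0], [1.0, 3.0]]  # two examples for the Toy model

# Row format: a list with one entry per example.
row_payload = json.dumps({"instances": batch})

# Columnar format: input tensors keyed by their signature alias.
col_payload = json.dumps({"inputs": {"input_1": batch}})

print(row_payload)  # {"instances": [[1.0, 2.0], [1.0, 3.0]]}
print(col_payload)  # {"inputs": {"input_1": [[1.0, 2.0], [1.0, 3.0]]}}
```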
## For production

- [SavedModel Warmup](https://www.tensorflow.org/tfx/serving/saved_model_warmup)
- `--enable_model_warmup`: enables model warmup using user-provided PredictionLogs in the assets.extra/ directory

<!-- TODO: need to decode the tfrecord file from serialized logs and check if it is right. -->