# Python API

## **Go through the README.md in the parent directory first**

- enter the python directory

  ```bash
  # assume that you are in the root of this repo
  $ cd python
  ```

- set up the environment

  ```bash
  $ pip install -U pip
  $ pip install numpy tensorflow tensorflow-serving-api grpcio
  ```

## gRPC API

- request a different model name

  ```bash
  $ python3 grpc_request.py -m Toy # python3 grpc_request.py -m <model name>
  # outputs {
  #   key: "output_1"
  #   value {
  #     dtype: DT_FLOAT
  #     tensor_shape {
  #       dim {
  #         size: 2
  #       }
  #       dim {
  #         size: 1
  #       }
  #     }
  #     float_val: 0.9990350008010864
  #     float_val: 0.9997349381446838
  #   }
  # }
  # model_spec {
  #   name: "Toy"
  #   version {
  #     value: 2
  #   }
  #   signature_name: "serving_default"
  # }
  $ python3 grpc_request.py -m Toy_double
  # outputs {
  #   key: "output_1"
  #   value {
  #     dtype: DT_FLOAT
  #     tensor_shape {
  #       dim {
  #         size: 2
  #       }
  #       dim {
  #         size: 1
  #       }
  #     }
  #     float_val: 6.803016662597656
  #     float_val: 8.262093544006348
  #   }
  # }
  # model_spec {
  #   name: "Toy_double"
  #   version {
  #     value: 1
  #   }
  #   signature_name: "serving_default"
  # }
  ```

- request a different version through the version number

  ```bash
  $ python3 grpc_request.py -v 1 # python3 grpc_request.py -v <version number>
  # outputs {
  #   key: "output_1"
  #   value {
  #     dtype: DT_FLOAT
  #     tensor_shape {
  #       dim {
  #         size: 2
  #       }
  #       dim {
  #         size: 1
  #       }
  #     }
  #     float_val: 10.805429458618164
  #     float_val: 14.010123252868652
  #   }
  # }
  # model_spec {
  #   name: "Toy"
  #   version {
  #     value: 1
  #   }
  #   signature_name: "serving_default"
  # }
  $ python3 grpc_request.py -v 2
  # outputs {
  #   key: "output_1"
  #   value {
  #     dtype: DT_FLOAT
  #     tensor_shape {
  #       dim {
  #         size: 2
  #       }
  #       dim {
  #         size: 1
  #       }
  #     }
  #     float_val: 0.9990350008010864
  #     float_val: 0.9997349381446838
  #   }
  # }
  # model_spec {
  #   name: "Toy"
  #   version {
  #     value: 2
  #   }
  #   signature_name: "serving_default"
  # }
  ```

- request a different version through the version annotation (label)

  ```bash
  $ python3 grpc_request.py -l stable # python3 grpc_request.py -l <model label or annotation>
  # outputs {
  #   key: "output_1"
  #   value {
  #     dtype: DT_FLOAT
  #     tensor_shape {
  #       dim {
  #         size: 2
  #       }
  #       dim {
  #         size: 1
  #       }
  #     }
  #     float_val: 10.805429458618164
  #     float_val: 14.010123252868652
  #   }
  # }
  # model_spec {
  #   name: "Toy"
  #   version {
  #     value: 1
  #   }
  #   signature_name: "serving_default"
  # }
  $ python3 grpc_request.py -l canary
  # outputs {
  #   key: "output_1"
  #   value {
  #     dtype: DT_FLOAT
  #     tensor_shape {
  #       dim {
  #         size: 2
  #       }
  #       dim {
  #         size: 1
  #       }
  #     }
  #     float_val: 0.9990350008010864
  #     float_val: 0.9997349381446838
  #   }
  # }
  # model_spec {
  #   name: "Toy"
  #   version {
  #     value: 2
  #   }
  #   signature_name: "serving_default"
  # }
  ```

- request a multiple-task model

- request model status

  ```bash
  $ python grpc_model_status.py -m Toy_double -v 1
  # model_version_status {
  #   version: 1
  #   state: AVAILABLE
  #   status {
  #   }
  # }
  ```

- request model metadata

  ```bash
  $ python grpc_metadata.py
  # model_spec {
  #   name: "Toy"
  #   version {
  #     value: 2
  #   }
  # }
  # metadata {
  #   key: "signature_def"
  #   value {
  #     type_url: "type.googleapis.com/tensorflow.serving.SignatureDefMap"
  #     value: "\n\253\001\n\017serving_default\022\227\001\n;\n\007input_1\0220\n\031serving_default_input_1:0\020\001\032\021\022\013\010\377\377\377\377\377\377\377\377\377\001\022\002\010\002\022<\n\010output_1\0220\n\031StatefulPartitionedCall:0\020\001\032\021\022\013\010\377\377\377\377\377\377\377\377\377\001\022\002\010\001\032\032tensorflow/serving/predict\n>\n\025__saved_model_init_op\022%\022#\n\025__saved_model_init_op\022\n\n\004NoOp\032\002\030\001"
  #   }
  # }
  ```

- reload a model through the gRPC API

  ```bash
  $ python grpc_reload_model.py -m Toy
  # model Toy reloaded successfully
  ```

- request model log

  ```bash
  $ python grpc_request_log.py -m Toy
  # ********************************************request logs********************************************
  # predict_log {
  #   request {
  #     model_spec {
  #       name: "Toy"
  #       signature_name: "serving_default"
  #     }
  #     inputs {
  #       key: "input_1"
  #       value {
  #         dtype: DT_FLOAT
  #         tensor_shape {
  #           dim {
  #             size: 2
  #           }
  #           dim {
  #             size: 2
  #           }
  #         }
  #         tensor_content: "\000\000\200?\000\000\000@\000\000\200?\000\000@@"
  #       }
  #     }
  #     output_filter: "output_1"
  #   }
  # }
  # ************************************************end*************************************************
  # **********************************************outputs***********************************************
  # outputs {
  #   key: "output_1"
  #   value {
  #     dtype: DT_FLOAT
  #     tensor_shape {
  #       dim {
  #         size: 2
  #       }
  #       dim {
  #         size: 1
  #       }
  #     }
  #     float_val: 0.9990350008010864
  #     float_val: 0.9997349381446838
  #   }
  # }
  # model_spec {
  #   name: "Toy"
  #   version {
  #     value: 2
  #   }
  #   signature_name: "serving_default"
  # }
  # ************************************************end*************************************************
  ```

- gRPC API for python
  - [grpc API](https://github.com/tensorflow/serving/tree/master/tensorflow_serving/apis)
- predict.proto

  ```protobuf
  syntax = "proto3";

  package tensorflow.serving;
  option cc_enable_arenas = true;

  import "tensorflow/core/framework/tensor.proto";
  import "tensorflow_serving/apis/model.proto";

  // PredictRequest specifies which TensorFlow model to run, as well as
  // how inputs are mapped to tensors and how outputs are filtered before
  // returning to user.
  message PredictRequest {
    // Model Specification. If version is not specified, will use the latest
    // (numerical) version.
    ModelSpec model_spec = 1;  // for python: `request.model_spec`

    // Input tensors.
    // Names of input tensor are alias names. The mapping from aliases to real
    // input tensor names is stored in the SavedModel export as a prediction
    // SignatureDef under the 'inputs' field.
    map<string, TensorProto> inputs = 2;  // for python: the `request.inputs` dictionary

    // Output filter.
    // Names specified are alias names. The mapping from aliases to real output
    // tensor names is stored in the SavedModel export as a prediction
    // SignatureDef under the 'outputs' field.
    // Only tensors specified here will be run/fetched and returned, with the
    // exception that when none is specified, all tensors specified in the
    // named signature will be run/fetched and returned.
    repeated string output_filter = 3;  // for python: a `request.output_filter` list; append values to it
  }

  // Response for PredictRequest on successful run.
  message PredictResponse {
    // Effective Model Specification used to process PredictRequest.
    ModelSpec model_spec = 2;

    // Output tensors.
    map<string, TensorProto> outputs = 1;
  }
  ```

- model.proto

  ```protobuf
  syntax = "proto3";

  package tensorflow.serving;
  option cc_enable_arenas = true;

  import "google/protobuf/wrappers.proto";

  // Metadata for an inference request such as the model name and version.
  message ModelSpec {
    // Required servable name.
    string name = 1;

    // Optional choice of which version of the model to use.
    //
    // Recommended to be left unset in the common case. Should be specified only
    // when there is a strong version consistency requirement.
    //
    // When left unspecified, the system will serve the best available version.
    // This is typically the latest version, though during version transitions,
    // notably when serving on a fleet of instances, may be either the previous or
    // new version.

    // for python: `request.model_spec.version.value`
    // or `request.model_spec.version_label`
    oneof version_choice {
      // Use this specific version number.
      google.protobuf.Int64Value version = 2;

      // Use the version associated with the given label.
      string version_label = 4;
    }

    // A named signature to evaluate. If unspecified, the default signature will
    // be used.
    // for python: `request.model_spec.signature_name`
    string signature_name = 3;
  }
  ```

- [RESTful API](https://www.tensorflow.org/tfx/serving/api_rest)

- run post_request.py

  ```bash
  $ python3 post_request.py
  # this request is based on instances
  # True
  # {
  #     "predictions": [[0.990161777], [0.99347043]
  #     ]
  # }
  # time consumption: 47.346710999999985ms
  # this request is based on inputs
  # True
  # {
  #     "outputs": [
  #         [
  #             0.985201657
  #         ],
  #         [
  #             0.99923408
  #         ]
  #     ]
  # }
  # time consumption: 6.932738000000049ms
  ```

## For production

- [SavedModel Warmup](https://www.tensorflow.org/tfx/serving/saved_model_warmup)
  - `--enable_model_warmup`: enables model warmup using user-provided PredictionLogs in the `assets.extra/` directory

<!-- TODO: need to decode the tfrecord file from serialized logs and check if it is right. -->