## Contents of this Document
* [Introduction](#introduction)
* [Conclusion](#conclusion)
## Introduction
Before jumping into this document, please go over the following docs:
1. [What is TorchServe?](../README.md)
1. [What is custom service code?](custom_service.md)
## Batch Inference with TorchServe using ResNet-152 model
To support batching of inference requests, TorchServe needs the following:
1. TorchServe Model Configuration: TorchServe provides the means to configure "Max Batch Size" and "Max Batch Delay" through the "POST /models" API.
   TorchServe needs to know the maximum batch size that the model can handle and the maximum delay that TorchServe should wait for to form this request batch.
2. Model Handler code: TorchServe requires the model handler to handle a batch of inference requests.
For the full working code of a custom model handler with batch processing, refer to [resnet152_handler.py](../examples/image_classifier/resnet_152_batch/resnet152_handler.py).
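The sketch below is a minimal, simplified illustration (not the actual resnet152_handler.py) of what batch-aware handler code looks like: when batching is enabled, TorchServe hands the handler a *list* of up to `batch_size` requests, and the handler must return one response per request in the same order. The class name, weight-file name, and preprocessing choices are assumptions made for this sketch only.

```python
# Minimal sketch of a batch-aware handler (illustrative only; see
# resnet152_handler.py above for the real implementation).
import io

import torch
from PIL import Image
from torchvision import transforms


class BatchImageClassifier:
    def __init__(self):
        self.model = None
        self.transform = transforms.Compose([
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
        ])

    def initialize(self, context):
        # Load the serialized model once per worker. The file name
        # "resnet152.pt" is an assumption for this sketch.
        model_dir = context.system_properties.get("model_dir")
        self.model = torch.jit.load(f"{model_dir}/resnet152.pt")
        self.model.eval()

    def handle(self, data, context):
        # `data` is the list of up to batch_size requests aggregated by the frontend.
        images = []
        for row in data:
            payload = row.get("data") or row.get("body")
            images.append(self.transform(Image.open(io.BytesIO(payload))))
        batch = torch.stack(images)  # one forward pass for the whole batch
        with torch.no_grad():
            predictions = self.model(batch).argmax(dim=1)
        # Return exactly one result per incoming request, in order.
        return [str(p.item()) for p in predictions]
```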
### TorchServe Model Configuration
To configure TorchServe to use the batching feature, you would have to provide the batch configuration information through the [**POST /models** API](management_api.md#register-a-model).
The configuration that we are interested in is the following:
1. `batch_size`: This is the maximum batch size that a model is expected to handle.
2. `max_batch_delay`: This is the maximum delay time TorchServe waits to receive `batch_size` number of requests. If TorchServe doesn't receive `batch_size` number of requests
before this timer times out, it sends whatever requests were received to the model `handler`.
Let's look at an example using this configuration:
```bash
# The following command will register a model "resnet-152.mar" and configure TorchServe to use a batch_size of 8 and a max batch delay of 50 milliseconds.
curl -X POST "localhost:8081/models?url=resnet-152.mar&batch_size=8&max_batch_delay=50"
```
These configurations are used both in TorchServe and in the model's custom service code (a.k.a. the handler code). TorchServe associates the batch-related configuration with each model. The frontend then tries to aggregate `batch_size` requests and send them to the backend.
## Demo to configure TorchServe with a batch-supported model
In this section, let's bring up the model server and launch the Resnet-152 model, which has been built to handle a batch of requests.
### Pre-requisites
Follow the main [Readme](../README.md) and install all the required packages, including `torchserve`.
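For reference, a typical pip-based install looks like the sketch below; the steps and versions in the main Readme are the authoritative source.

```bash
# Install TorchServe, the model archiver, and torch/torchvision for the example model.
pip install torch torchvision torchserve torch-model-archiver
```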
### Loading Resnet-152 which handles batch inferences
* Start the model server. In this example, we are starting the model server to run on inference port 8080 and management port 8081.
```text
$ cat config.properties
...
inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
...
$ torchserve --start --model-store model_store
```
Note: This example assumes that the resnet-152.mar file is available in the TorchServe model_store. For more details on creating the resnet-152 mar file and serving it on TorchServe, refer to the [resnet152 image classification example](../examples/image_classifier/resnet_152_batch/README.md).
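As a rough sketch (the linked example README is the authoritative reference), the mar file is produced with `torch-model-archiver`; the weight-file and path names below are assumptions made for illustration.

```bash
# Package the model definition, its weights, and the batch-aware handler into resnet-152.mar
torch-model-archiver --model-name resnet-152 \
    --version 1.0 \
    --model-file examples/image_classifier/resnet_152_batch/model.py \
    --serialized-file resnet152-weights.pth \
    --handler examples/image_classifier/resnet_152_batch/resnet152_handler.py \
    --export-path model_store
```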
* Verify that TorchServe is up and running
```text
$ curl localhost:8080/ping
{
"status": "Healthy"
}
```
* Now let's launch the resnet-152 model, which we have built to handle batch inference. Since this is an example, we are going to launch 1 worker which handles a batch size of 8
with a max-batch-delay of 10ms.
```text
$ curl -X POST "localhost:8081/models?url=resnet-152.mar&batch_size=8&max_batch_delay=10&initial_workers=1"
$ curl -X POST localhost:8080/predictions/resnet-152 -T kitten.jpg
{
  "probability": 0.7148938179016113,
  "class": "n02123045 tabby, tabby cat"
},
{
  "probability": 0.22877725958824158,
  "class": "n02123159 tiger cat"
},
{
  "probability": 0.04032370448112488,
  "class": "n02124075 Egyptian cat"
},
{
  "probability": 0.00837081391364336,
  "class": "n02127052 lynx, catamount"
},
{
  "probability": 0.0006728120497427881,
  "class": "n02129604 tiger, Panthera tigris"
}
```
* Now that we have the service up and running, we can run performance tests with the same kitten image as follows. There are multiple tools to measure the performance of web servers. We will use
[apache-bench](https://httpd.apache.org/docs/2.4/programs/ab.html) to run our performance tests. We chose `apache-bench` for our tests because of its ease of installation and ease of running tests.
Before running this test, we need to first install `apache-bench` on our system. Since we were running this on an Ubuntu host, we installed apache-bench as follows.
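On Ubuntu, `apache-bench` ships with the `apache2-utils` package. The exact command used in the original test is not reproduced here, but an `ab` invocation consistent with the numbers described below would look roughly like this:

```bash
# Install apache-bench on Ubuntu
sudo apt-get install apache2-utils

# 10,000 total requests, 1000 concurrent, POSTing the kitten image to the prediction endpoint
ab -k -n 10000 -c 1000 -p kitten.jpg -T "image/jpeg" http://localhost:8080/predictions/resnet-152
```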
The above test simulates TorchServe receiving 1000 concurrent requests at once and a total of 10,000 requests. All of these requests are directed to the endpoint "localhost:8080/predictions/resnet-152", which assumes
that resnet-152 is already registered and scaled up on TorchServe. We did this registration and scaling up in the steps above.
## Conclusion
The takeaway from the experiments is that batching is a very useful feature. In cases where the services receive a heavy load of requests or each request has high I/O, it's advantageous
to batch the requests. This allows for maximally utilizing the compute resources, especially GPU compute, which is more often than not more expensive. But customers should
carefully choose the `batch_size` and `max_batch_delay` values that suit their use case.
```bash
curl -X POST "localhost:8081/models?model_name=resnet152&url=resnet-152-batch.mar&batch_size=4&max_batch_delay=5000&initial_workers=3&synchronous=true"
```
The above commands will create the mar file and register the resnet152 model with TorchServe with the following configuration:
- model_name: resnet152
- batch_size: 4
- max_batch_delay: 5000 ms
- workers: 3
To test batch inference, execute the following commands within the specified max_batch_delay time:
```bash
curl -X POST http://127.0.0.1:8080/predictions/resnet152 -T serve/examples/image_classifier/resnet_152_batch/images/croco.jpg &
curl -X POST http://127.0.0.1:8080/predictions/resnet152 -T serve/examples/image_classifier/resnet_152_batch/images/dog.jpg &
curl -X POST http://127.0.0.1:8080/predictions/resnet152 -T serve/examples/image_classifier/resnet_152_batch/images/kitten.jpg &
```
#### TorchScript example using Resnet152 image classifier:
* Save the Resnet152-batch model as an executable script module or a traced script:
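A minimal sketch of that step, assuming the standard torchvision ResNet-152 weights (tracing shown here; `torch.jit.script` works similarly). The output file name is an assumption for illustration.

```python
# Trace ResNet-152 with an example input and save it as a TorchScript module
# that can then be packaged into the batch-example mar file.
import torch
import torchvision

model = torchvision.models.resnet152(pretrained=True)
model.eval()

example_input = torch.randn(1, 3, 224, 224)           # single RGB 224x224 image
traced_model = torch.jit.trace(model, example_input)   # or: torch.jit.script(model)
traced_model.save("resnet-152-batch.pt")               # serialized file name is an assumption
```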