---
title: "Hugging Face"
lang: "en-US"
draft: false
description: "Learn about how to set up a VDP Hugging Face connector https://github.com/instill-ai/instill-core"
---

The Hugging Face component is an AI connector that allows users to connect to the AI models served on the Hugging Face Platform.
It can carry out the following tasks:

- [Text Generation](#text-generation)
- [Fill Mask](#fill-mask)
- [Summarization](#summarization)
- [Text Classification](#text-classification)
- [Token Classification](#token-classification)
- [Translation](#translation)
- [Zero Shot Classification](#zero-shot-classification)
- [Question Answering](#question-answering)
- [Table Question Answering](#table-question-answering)
- [Sentence Similarity](#sentence-similarity)
- [Conversational](#conversational)
- [Image Classification](#image-classification)
- [Image Segmentation](#image-segmentation)
- [Object Detection](#object-detection)
- [Image To Text](#image-to-text)
- [Speech Recognition](#speech-recognition)
- [Audio Classification](#audio-classification)

## Release Stage

`Alpha`

## Configuration

The component configuration is defined and maintained [here](https://github.com/instill-ai/component/blob/main/pkg/connector/huggingface/v0/config/definition.json).

## Connection

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| API Key (required) | `api_key` | string | Fill in your Hugging Face API token. To find your token, visit https://huggingface.co/settings/tokens. |
| Base URL (required) | `base_url` | string | Hostname for the endpoint. To use the Inference API, set it to https://api-inference.huggingface.co; for an Inference Endpoint, set it to your custom endpoint. |
| Is Custom Endpoint (required) | `is_custom_endpoint` | boolean | Set to true if you are using a custom Inference Endpoint and not the Inference API. |

## Supported Tasks

### Text Generation

Generating text is the task of producing new text. These models can, for example, fill in incomplete text or paraphrase.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_TEXT_GENERATION` |
| Model (required) | `model` | string | The Hugging Face model to be used |
| String Input (required) | `inputs` | string | String input |
| Parameters | `parameters` | object | Parameters |
| Options | `options` | object | Options for the model |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Generated Text | `generated_text` | string | The continued string |

### Fill Mask

Masked language modeling is the task of masking some of the words in a sentence and predicting which words should replace those masks.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_FILL_MASK` |
| Model (required) | `model` | string | The Hugging Face model to be used |
| String Input (required) | `inputs` | string | A string to be filled in; it must contain the [MASK] token (check the model card for the exact name of the mask) |
| Options | `options` | object | Options for the model |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Results | `results` | array[object] | Results |

### Summarization

Summarization is the task of producing a shorter version of a document while preserving its important information.
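
As an illustration of how an input for this task might be assembled, here is a minimal Python sketch that uses the field IDs documented for Summarization (`task`, `model`, `inputs`, `parameters`, `options`). The helper name `build_text_task_input` and the model ID are assumptions made for the example; they are not part of the component.

```python
from typing import Any, Dict, Optional


def build_text_task_input(
    task: str,
    model: str,
    inputs: str,
    parameters: Optional[Dict[str, Any]] = None,
    options: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble an input object for a string-based task (hypothetical helper)."""
    if not inputs:
        raise ValueError("`inputs` is required and must be a non-empty string")
    payload: Dict[str, Any] = {"task": task, "model": model, "inputs": inputs}
    # `parameters` and `options` are optional, so they are only added when given.
    if parameters:
        payload["parameters"] = parameters
    if options:
        payload["options"] = options
    return payload


summarization_input = build_text_task_input(
    task="TASK_SUMMARIZATION",
    model="facebook/bart-large-cnn",  # example model ID, chosen for illustration
    inputs="Hugging Face hosts many models. This connector calls them by task.",
)
print(summarization_input)
```

The same helper shape would also fit the other string-input tasks, since they share the `task`, `model`, and `inputs` fields.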

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_SUMMARIZATION` |
| Model (required) | `model` | string | The Hugging Face model to be used |
| String Input (required) | `inputs` | string | String input |
| Parameters | `parameters` | object | Parameters |
| Options | `options` | object | Options for the model |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Summary Text | `summary_text` | string | The string after summarization |

### Text Classification

Text Classification is the task of assigning a label or class to a given text.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_TEXT_CLASSIFICATION` |
| Model (required) | `model` | string | The Hugging Face model to be used |
| String Input (required) | `inputs` | string | String input |
| Options | `options` | object | Options for the model |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Results | `results` | array[object] | Results |

### Token Classification

Token classification is a natural language understanding task in which a label is assigned to some tokens in a text.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_TOKEN_CLASSIFICATION` |
| Model (required) | `model` | string | The Hugging Face model to be used |
| String Input (required) | `inputs` | string | String input |
| Parameters | `parameters` | object | Parameters |
| Options | `options` | object | Options for the model |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Results | `results` | array[object] | Results |

### Translation

Translation is the task of converting text from one language to another.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_TRANSLATION` |
| Model (required) | `model` | string | The Hugging Face model to be used |
| String Input (required) | `inputs` | string | String input |
| Options | `options` | object | Options for the model |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Translation Text | `translation_text` | string | The string after translation |

### Zero Shot Classification

Zero-shot text classification is a task in natural language processing where a model is trained on a set of labeled examples but is then able to classify new examples from previously unseen classes.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_ZERO_SHOT_CLASSIFICATION` |
| Model (required) | `model` | string | The Hugging Face model to be used |
| String Input (required) | `inputs` | string | String input |
| Parameters | `parameters` | object | Parameters |
| Options | `options` | object | Options for the model |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Scores | `scores` | array[number] | A list of floats that correspond to the probability of each label, in the same order as `labels`. |
| Labels | `labels` | array[string] | The list of strings for the labels that you sent (in order) |
| Sequence (optional) | `sequence` | string | The string sent as an input |

### Question Answering

Question Answering models can retrieve the answer to a question from a given text, which is useful for searching for an answer in a document.
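
The optional `start` and `stop` outputs of this task locate the answer inside the context. The sketch below shows one way they could be used to recover the answer span. It assumes that both values are character offsets and that `stop` is exclusive; the helper `extract_answer_span` and the sample output are hand-written for illustration and are not a real model response.

```python
from typing import Any, Dict


def extract_answer_span(context: str, output: Dict[str, Any]) -> str:
    """Slice the answer out of `context`, falling back to the `answer` field."""
    start = output.get("start")
    stop = output.get("stop")
    # `start` and `stop` are optional outputs, so fall back when either is absent.
    if start is None or stop is None:
        return output["answer"]
    return context[start:stop]


context = "The connector is maintained by Instill AI."
answer = "Instill AI"
answer_start = context.index(answer)

# Hand-written sample shaped like the documented output fields.
sample_output = {
    "answer": answer,
    "start": answer_start,
    "stop": answer_start + len(answer),  # assumed to be an exclusive offset
    "score": 0.9,
}

print(extract_answer_span(context, sample_output))
```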

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_QUESTION_ANSWERING` |
| Model (required) | `model` | string | The Hugging Face model to be used |
| Inputs (required) | `inputs` | object | Inputs |
| Options | `options` | object | Options for the model |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Answer | `answer` | string | A string that’s the answer within the text. |
| Stop (optional) | `stop` | integer | The index (string-wise) of the stop of the answer within the context. |
| Score (optional) | `score` | number | A float that represents how likely it is that the answer is correct |
| Start (optional) | `start` | integer | The index (string-wise) of the start of the answer within the context. |

### Table Question Answering

Table Question Answering (Table QA) is the task of answering a question about information in a given table.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_TABLE_QUESTION_ANSWERING` |
| Model (required) | `model` | string | The Hugging Face model to be used |
| Inputs (required) | `inputs` | object | Inputs |
| Options | `options` | object | Options for the model |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Aggregator (optional) | `aggregator` | string | The aggregator used to get the answer |
| Answer | `answer` | string | The plaintext answer |
| Cells (optional) | `cells` | array[string] | A list of the contents of the cells referenced in the answer |
| Coordinates (optional) | `coordinates` | array[array] | A list of coordinates of the cells referenced in the answer |

### Sentence Similarity

Sentence Similarity is the task of determining how similar two texts are.
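
Because this task returns one score per candidate string, a common follow-up step is to rank the candidates. The sketch below assumes that `scores` comes back in the same order as the candidate sentences that were sent; the helper `rank_by_similarity` and the sample scores are made up for illustration.

```python
from typing import List, Sequence, Tuple


def rank_by_similarity(
    sentences: Sequence[str], scores: Sequence[float]
) -> List[Tuple[str, float]]:
    """Pair each candidate sentence with its score, most similar first."""
    if len(sentences) != len(scores):
        raise ValueError("expected exactly one score per candidate sentence")
    return sorted(zip(sentences, scores), key=lambda pair: pair[1], reverse=True)


candidates = ["That is a happy dog", "That is a very happy person", "Today is a sunny day"]
sample_scores = [0.69, 0.94, 0.26]  # hand-written values, not a real model response

ranked = rank_by_similarity(candidates, sample_scores)
best_sentence, best_score = ranked[0]
print(best_sentence, best_score)
```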

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_SENTENCE_SIMILARITY` |
| Model (required) | `model` | string | The Hugging Face model to be used |
| Inputs (required) | `inputs` | object | Inputs |
| Options | `options` | object | Options for the model |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Scores | `scores` | array[number] | The associated similarity score for each of the given strings |

### Conversational

Conversational response modelling is the task of generating conversational text that is relevant, coherent and knowledgeable given a prompt.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_CONVERSATIONAL` |
| Model (required) | `model` | string | The Hugging Face model to be used |
| Inputs (required) | `inputs` | object | Inputs |
| Parameters | `parameters` | object | Parameters |
| Options | `options` | object | Options for the model |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Conversation (optional) | `conversation` | object | A convenience dictionary to send back with the next input (with the new user input added). |
| Generated Text | `generated_text` | string | The answer of the bot |

### Image Classification

Image classification is the task of assigning a label or class to an entire image.
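
The `image` input for this task is typed as a string. The sketch below assumes the string is the base64 encoding of the image file; that encoding is an assumption, so check the component definition before relying on it. The helper `encode_file_bytes`, the placeholder bytes, and the model ID are illustrative only.

```python
import base64


def encode_file_bytes(data: bytes) -> str:
    """Return the base64 text form of raw file bytes (assumed input encoding)."""
    return base64.b64encode(data).decode("ascii")


# Placeholder bytes: only the PNG signature, standing in for a real image file.
image_bytes = b"\x89PNG\r\n\x1a\n"

task_input = {
    "task": "TASK_IMAGE_CLASSIFICATION",
    "model": "google/vit-base-patch16-224",  # example model ID, chosen for illustration
    "image": encode_file_bytes(image_bytes),
}
print(task_input["image"])
```

In practice the bytes would be read from disk, for example with `pathlib.Path(...).read_bytes()`.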

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_IMAGE_CLASSIFICATION` |
| Model (required) | `model` | string | The Hugging Face model to be used |
| Image (required) | `image` | string | The image file |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Classes | `classes` | array[object] | Classes |

### Image Segmentation

Image Segmentation divides an image into segments where each pixel in the image is mapped to an object.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_IMAGE_SEGMENTATION` |
| Model (required) | `model` | string | The Hugging Face model to be used |
| Image (required) | `image` | string | The image file |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Segments | `segments` | array[object] | Segments |

### Object Detection

Object Detection models allow users to identify objects of certain defined classes.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_OBJECT_DETECTION` |
| Model (required) | `model` | string | The Hugging Face model to be used |
| Image (required) | `image` | string | The image file |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Objects | `objects` | array[object] | Objects |

### Image To Text

Image to text models output text from a given image.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_IMAGE_TO_TEXT` |
| Model (required) | `model` | string | The Hugging Face model to be used |
| Image (required) | `image` | string | The image file |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Text | `text` | string | Generated text |

### Speech Recognition

Automatic Speech Recognition (ASR), also known as Speech to Text (STT), is the task of transcribing a given audio to text.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_SPEECH_RECOGNITION` |
| Model (required) | `model` | string | The Hugging Face model to be used |
| Audio (required) | `audio` | string | The audio file |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Text | `text` | string | The string that was recognized within the audio file. |

### Audio Classification

Audio classification is the task of assigning a label or class to a given audio.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_AUDIO_CLASSIFICATION` |
| Model (required) | `model` | string | The Hugging Face model to be used |
| Audio (required) | `audio` | string | The audio file |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Classes | `classes` | array[object] | Classes |
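
The required inputs across all tasks follow three patterns: text tasks need `inputs`, image tasks need `image`, and audio tasks need `audio`, each alongside `task` and `model`. The sketch below collects the fields marked "(required)" in the input tables above into a lookup and checks an input object against it. The names `REQUIRED_INPUT_FIELDS` and `missing_fields` are hypothetical and exist only for this example; the component itself may validate differently.

```python
from typing import Any, Dict, List, Tuple

TEXT_TASKS = [
    "TASK_TEXT_GENERATION", "TASK_FILL_MASK", "TASK_SUMMARIZATION",
    "TASK_TEXT_CLASSIFICATION", "TASK_TOKEN_CLASSIFICATION", "TASK_TRANSLATION",
    "TASK_ZERO_SHOT_CLASSIFICATION", "TASK_QUESTION_ANSWERING",
    "TASK_TABLE_QUESTION_ANSWERING", "TASK_SENTENCE_SIMILARITY", "TASK_CONVERSATIONAL",
]
IMAGE_TASKS = [
    "TASK_IMAGE_CLASSIFICATION", "TASK_IMAGE_SEGMENTATION",
    "TASK_OBJECT_DETECTION", "TASK_IMAGE_TO_TEXT",
]
AUDIO_TASKS = ["TASK_SPEECH_RECOGNITION", "TASK_AUDIO_CLASSIFICATION"]

# Required field IDs per task, besides `task` itself.
REQUIRED_INPUT_FIELDS: Dict[str, Tuple[str, ...]] = {
    **{task: ("model", "inputs") for task in TEXT_TASKS},
    **{task: ("model", "image") for task in IMAGE_TASKS},
    **{task: ("model", "audio") for task in AUDIO_TASKS},
}


def missing_fields(task_input: Dict[str, Any]) -> List[str]:
    """Return the required field IDs that are absent from `task_input`."""
    task = task_input.get("task")
    if task not in REQUIRED_INPUT_FIELDS:
        raise ValueError(f"unsupported task: {task!r}")
    return [field for field in REQUIRED_INPUT_FIELDS[task] if field not in task_input]


print(missing_fields({"task": "TASK_SPEECH_RECOGNITION", "model": "some-model"}))
```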