# TFLite Task library - C++ A flexible and ready-to-use library for common machine learning model types, such as classification and detection. ## Text Task Libraries ### QuestionAnswerer `QuestionAnswerer` API is able to load [Mobile BERT](https://tfhub.dev/tensorflow/mobilebert/1) or [AlBert](https://tfhub.dev/tensorflow/albert_lite_base/1) TFLite models and answer question based on context. Use the C++ API to answer questions as follows: ```cc using tflite::task::text::qa::BertQuestionAnswerer; using tflite::task::text::qa::QaAnswer; // Create API handler with Mobile Bert model. auto qa_client = BertQuestionAnswerer::CreateBertQuestionAnswererFromFile("/path/to/mobileBertModel", "/path/to/vocab"); // Or create API handler with Albert model. // auto qa_client = BertQuestionAnswerer::CreateAlbertQuestionAnswererFromFile("/path/to/alBertModel", "/path/to/sentencePieceModel"); std::string context = "Nikola Tesla (Serbian Cyrillic: Никола Тесла; 10 " "July 1856 – 7 January 1943) was a Serbian American inventor, electrical " "engineer, mechanical engineer, physicist, and futurist best known for his " "contributions to the design of the modern alternating current (AC) " "electricity supply system."; std::string question = "When was Nikola Tesla born?"; // Run inference with `context` and a given `question` to the context, and get top-k // answers ranked by logits. const std::vector answers = qa_client->Answer(context, question); // Access QaAnswer results. for (const QaAnswer& item : answers) { std::cout << absl::StrFormat("Text: %s logit=%f start=%d end=%d", item.text, item.pos.logit, item.pos.start, item.pos.end) << std::endl; } // Output: // Text: 10 July 1856 logit=16.8527 start=17 end=19 // ... (and more) // // So the top-1 answer is: "10 July 1856". ``` In the above code, `item.text` is the text content of an answer. We use a span with closed interval `[item.pos.start, item.pos.end]` to denote predicted tokens in the answer, and `item.pos.logit` is the sum of span logits to represent the confidence score. ### NLClassifier `NLClassifier` API is able to load any TFLite models for natural language classaification task such as language detection or sentiment detection. The API expects a TFLite model with the following input/output tensor: Input tensor0: (kTfLiteString) - input of the model, accepts a string. Output tensor0: (kTfLiteUInt8/kTfLiteInt8/kTfLiteInt16/kTfLiteFloat32/kTfLiteFloat64) - output scores for each class, if type is one of the Int types, dequantize it to double Output tensor1: optional (kTfLiteString) - output classname for each class, should be of the same length with scores. If this tensor is not present, the API uses score indices as classnames. By default the API tries to find the input/output tensors with default configurations in NLClassifierOptions, with tensor name prioritized over tensor index. The option is configurable for different TFLite models. Use the C++ API to perform language ID classification as follows: ```cc using tflite::task::text::nlclassifier::NLClassifier; using tflite::task::core::Category; auto classifier = NLClassifier::CreateFromFileAndOptions("/path/to/model"); // Or create a customized NLClassifierOptions // NLClassifierOptions options = // { // .output_score_tensor_name = myOutputScoreTensorName, // .output_label_tensor_name = myOutputLabelTensorName, // } // auto classifier = NLClassifier::CreateFromFileAndOptions("/path/to/model", options); std::string context = "What language is this?"; std::vector categories = classifier->Classify(context); // Access category results. for (const Categoryr& category : categories) { std::cout << absl::StrFormat("Language: %s Probability: %f", category.class_name, category_.score) << std::endl; } // Output: // Language: en Probability=0.9 // ... (and more) // // So the top-1 answer is 'en'. ``` ## Vision Task Libraries ### Image Classifier `ImageClassifier` accepts any TFLite image classification model (with optional, but strongly recommended, TFLite Model Metadata) that conforms to the following spec: Input tensor (type: `kTfLiteUInt8` / `kTfLiteFloat32`): - image input of size `[batch x height x width x channels]`. - batch inference is not supported (`batch` is required to be 1). - only RGB inputs are supported (`channels` is required to be 3). - if type is `kTfLiteFloat32`, `NormalizationOptions` are required to be attached to the metadata for input normalization. At least one output tensor (type: `kTfLiteUInt8` / `kTfLiteFloat32`) with: - `N` classes and either 2 or 4 dimensions, i.e. `[1 x N]` or `[1 x 1 x 1 x N]` - optional (but recommended) label map(s) as AssociatedFile-s with type TENSOR_AXIS_LABELS, containing one label per line. The first such AssociatedFile (if any) is used to fill the `class_name` field of the results. The `display_name` field is filled from the AssociatedFile (if any) whose locale matches the `display_names_locale` field of the `ImageClassifierOptions` used at creation time ("en" by default, i.e. English). If none of these are available, only the `index` field of the results will be filled. An example of such model can be found at: https://tfhub.dev/bohemian-visual-recognition-alliance/lite-model/models/mushroom-identification_v1/1 Example usage: ```cc // More options are available (e.g. max number of results to return). At the // very least, the model must be specified: ImageClassifierOptions options; options.mutable_model_file_with_metadata()->set_file_name( "/path/to/model.tflite"); // Create an ImageClassifier instance from the options. StatusOr> image_classifier_or = ImageClassifier::CreateFromOptions(options); // Check if an error occurred. if (!image_classifier_or.ok()) { std::cerr << "An error occurred during ImageClassifier creation: " << image_classifier_or.status().message(); return; } std::unique_ptr image_classifier = std::move(image_classifier_or.value()); // Prepare FrameBuffer input from e.g. image RGBA data, width and height: std::unique_ptr frame_buffer = CreateFromRgbaRawBuffer(image_rgba_data, {image_width, image_height}); // Run inference: StatusOr result_or = image_classifier->Classify(*frame_buffer); // Check if an error occurred. if (!result_or.ok()) { std::cerr << "An error occurred during classification: " << result_or.status().message(); return; } ClassificationResult result = result_or.value(); // Example value for 'result': // // classifications { // classes { index: 934 score: 0.95 class_name: "cat" } // classes { index: 948 score: 0.007 class_name: "dog" } // classes { index: 927 score: 0.003 class_name: "fox" } // head_index: 0 // } ``` A CLI demo tool is also available [here][1] for easily trying out this API. ### Object Detector `ObjectDetector` accepts any object detection TFLite model (with mandatory TFLite Model Metadata) that conforms to the following spec (e.g. Single Shot Detectors): Input tensor (type: `kTfLiteUInt8` / `kTfLiteFloat32`): - image input of size `[batch x height x width x channels]`. - batch inference is not supported (`batch` is required to be 1). - only RGB inputs are supported (`channels` is required to be 3). - if type is kTfLiteFloat32, `NormalizationOptions` are required to be attached to the metadata for input normalization. Output tensors must be the 4 outputs (type: `kTfLiteFloat32`) of a [`DetectionPostProcess`][2] op, i.e: * Locations: - of size `[num_results x 4]`, the inner array representing bounding boxes in the form [top, left, right, bottom]. - BoundingBoxProperties are required to be attached to the metadata and must specify type=BOUNDARIES and coordinate_type=RATIO. * Classes: - of size `[num_results]`, each value representing the integer index of a class. - optional (but recommended) label map(s) can be attached as AssociatedFile-s with type TENSOR_VALUE_LABELS, containing one label per line. The first such AssociatedFile (if any) is used to fill the `class_name` field of the results. The `display_name` field is filled from the AssociatedFile (if any) whose locale matches the `display_names_locale` field of the `ObjectDetectorOptions` used at creation time ("en" by default, i.e. English). If none of these are available, only the `index` field of the results will be filled. * Scores: - of size `[num_results]`, each value representing the score of the detected object. * Number of results: - integer `num_results` as a tensor of size `[1]` An example of such model can be found at: https://tfhub.dev/google/lite-model/object_detection/mobile_object_localizer_v1/1/metadata/1 Example usage: ```cc // More options are available (e.g. max number of results to return). At the // very least, the model must be specified: ObjectDetectorOptions options; options.mutable_model_file_with_metadata()->set_file_name( "/path/to/model.tflite"); // Create an ObjectDetector instance from the options. StatusOr> object_detector_or = ObjectDetector::CreateFromOptions(options); // Check if an error occurred. if (!object_detector_or.ok()) { std::cerr << "An error occurred during ObjectDetector creation: " << object_detector_or.status().message(); return; } std::unique_ptr object_detector = std::move(object_detector_or.value()); // Prepare FrameBuffer input from e.g. image RGBA data, width and height: std::unique_ptr frame_buffer = CreateFromRgbaRawBuffer(image_rgba_data, {image_width, image_height}); // Run inference: StatusOr result_or = object_detector->Detect(*frame_buffer); // Check if an error occurred. if (!result_or.ok()) { std::cerr << "An error occurred during detection: " << result_or.status().message(); return; } DetectionResult result = result_or.value(); // Example value for 'result': // // detections { // bounding_box { // origin_x: 54 // origin_y: 398 // width: 393 // height: 196 // } // classes { index: 16 score: 0.65 class_name: "cat" } // } // detections { // bounding_box { // origin_x: 602 // origin_y: 157 // width: 394 // height: 447 // } // classes { index: 17 score: 0.45 class_name: "dog" } // } ``` A CLI demo tool is available [here][3] for easily trying out this API. ### Image Segmenter `ImageSegmenter` accepts any TFLite model (with optional, but strongly recommended, TFLite Model Metadata) that conforms to the following spec: Input tensor (type: `kTfLiteUInt8` / `kTfLiteFloat32`): - image input of size `[batch x height x width x channels]`. - batch inference is not supported (`batch` is required to be 1). - only RGB inputs are supported (`channels` is required to be 3). - if type is kTfLiteFloat32, `NormalizationOptions` are required to be attached to the metadata for input normalization. Output tensor (type: `kTfLiteUInt8` / `kTfLiteFloat32`): - tensor of size `[batch x mask_height x mask_width x num_classes]`, where `batch` is required to be 1, `mask_width` and `mask_height` are the dimensions of the segmentation masks produced by the model, and `num_classes` is the number of classes supported by the model. - optional (but recommended) label map(s) can be attached as AssociatedFile-s with type TENSOR_AXIS_LABELS, containing one label per line. The first such AssociatedFile (if any) is used to fill the `class_name` field of the results. The `display_name` field is filled from the AssociatedFile (if any) whose locale matches the `display_names_locale` field of the `ImageSegmenterOptions` used at creation time ("en" by default, i.e. English). If none of these are available, only the `index` field of the results will be filled. An example of such model can be found at: https://tfhub.dev/tensorflow/lite-model/deeplabv3/1/metadata/1 Example usage: ```cc // More options are available to select between return a single category mask // or multiple confidence masks during post-processing. ImageSegmenterOptions options; options.mutable_model_file_with_metadata()->set_file_name( "/path/to/model.tflite"); // Create an ImageSegmenter instance from the options. StatusOr> image_segmenter_or = ImageSegmenter::CreateFromOptions(options); // Check if an error occurred. if (!image_segmenter_or.ok()) { std::cerr << "An error occurred during ImageSegmenter creation: " << image_segmenter_or.status().message(); return; } std::unique_ptr immage_segmenter = std::move(image_segmenter_or.value()); // Prepare FrameBuffer input from e.g. image RGBA data, width and height: std::unique_ptr frame_buffer = CreateFromRgbaRawBuffer(image_rgba_data, {image_width, image_height}); // Run inference: StatusOr result_or = immage_segmenter->Segment(*frame_buffer); // Check if an error occurred. if (!result_or.ok()) { std::cerr << "An error occurred during segmentation: " << result_or.status().message(); return; } SegmentationResult result = result_or.value(); // Example value for 'result': // // segmentation { // width: 257 // height: 257 // category_mask: "\x00\x01..." // colored_labels { r: 0 g: 0 b: 0 class_name: "background" } // colored_labels { r: 128 g: 0 b: 0 class_name: "aeroplane" } // ... // colored_labels { r: 128 g: 192 b: 0 class_name: "train" } // colored_labels { r: 0 g: 64 b: 128 class_name: "tv" } // } // // Where 'category_mask' is a byte buffer of size 'width' x 'height', with the // value of each pixel representing the class this pixel belongs to (e.g. '\x00' // means "background", '\x01' means "aeroplane", etc). // 'colored_labels' provides the label for each possible value, as well as // suggested RGB components to optionally transform the result into a more // human-friendly colored image. // ``` A CLI demo tool is available [here][4] for easily trying out this API. [1]: https://github.com/tensorflow/tflite-support/blob/master/tensorflow_lite_support/examples/task/vision/desktop/image_classifier_demo.cc [2]: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/kernels/detection_postprocess.cc [3]: https://github.com/tensorflow/tflite-support/blob/master/tensorflow_lite_support/examples/task/vision/desktop/object_detector_demo.cc [4]: https://github.com/tensorflow/tflite-support/blob/master/tensorflow_lite_support/examples/task/vision/desktop/image_segmenter_demo.cc