Frequently Asked Questions

What, exactly, does Jaxon do?

Jaxon is a semi-supervised data labeling solution that bootstraps a small number of human-provided labels into full-scale training datasets for text-oriented machine learning applications. Jaxon incorporates knowledge from large unlabeled corpora, automatically parameterizes its data processing pipelines to fit specific datasets, and applies ensembles of previously created labeling models to new data labeling tasks. Compared with traditional approaches, Jaxon is dramatically faster (minutes rather than months), costs a fraction as much, labels more consistently, reduces labeler bias and human error, and enables online learning scenarios.

What’s Jaxon’s accuracy? Will it improve my model’s accuracy?

Jaxon uses the F1 score as its optimization metric and displays an estimated F1 based on a sample of the training set. The accuracy of a training set depends on the amount, quality, and breadth of the data imported into Jaxon, and on how well that data relates to the downstream model and use case. The best results are achieved when the provided ground-truth labels are balanced and the training set, including unlabeled examples, is representative of the overall dataset. Jaxon’s synthetic labels (generated from the training set) increase the accuracy of downstream machine learning models and classifiers, and even better results can be achieved by using Jaxon iteratively to refine and enhance the training data.
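
For readers unfamiliar with the metric, the following is a minimal sketch of how an F1 score can be estimated from a random sample of labeled examples. It is a generic illustration, not Jaxon's implementation: the function names (f1_score, estimate_f1), the (true label, predicted label) pair format, and the fixed random seed are all assumptions made for this example.

```python
import random


def f1_score(y_true, y_pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def estimate_f1(examples, sample_size, positive, seed=0):
    """Estimate F1 from a random sample of (true_label, predicted_label) pairs.

    Hypothetical helper for illustration only; the sampling strategy a real
    tool uses may differ.
    """
    rng = random.Random(seed)
    sample = rng.sample(examples, min(sample_size, len(examples)))
    y_true = [t for t, _ in sample]
    y_pred = [p for _, p in sample]
    return f1_score(y_true, y_pred, positive)
```

Because the estimate comes from a sample, it is only as trustworthy as the sample is representative, which is one reason balanced labels and a representative training set matter.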

How does Jaxon handle domain-specific phrases/terminology?

This is one of the main advantages of using Jaxon: all words are incorporated easily and efficiently. During the training phase, Jaxon ingests every word found in the training corpus, including any domain-specific language, slang, or common abbreviations and misspellings, and incorporates them into its label generation model(s).
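
As a rough illustration of why a corpus-derived vocabulary picks up domain terms and misspellings, here is a minimal sketch that builds a vocabulary directly from the text instead of a fixed dictionary. This is not Jaxon's pipeline: the build_vocabulary function, the tokenization rule, and the sample sentences are assumptions made for this example.

```python
import re
from collections import Counter


def build_vocabulary(corpus):
    """Count every token that appears in the corpus.

    No external dictionary is consulted, so hostnames, abbreviations,
    and misspellings all receive entries alongside ordinary words.
    """
    counts = Counter()
    for document in corpus:
        # Lowercase, then keep runs of letters, digits, underscores, and apostrophes.
        counts.update(re.findall(r"[\w']+", document.lower()))
    return counts


# Hypothetical IT-support snippets containing jargon and a typo.
corpus = [
    "srv01 threw an OOM error again",
    "pls restart srv01, teh cache is full",
]
vocab = build_vocabulary(corpus)
```

Here the hostname "srv01", the abbreviation "oom", and the misspelling "teh" all end up in the vocabulary, because membership is determined by the corpus itself.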

So what do I use Jaxon labels for?

Machine learning models, and deep learning models in particular, can require training sets that contain millions to billions of examples, and labeling them by hand can take months and a large workforce. Jaxon replaces costly, slow, error-prone, and inconsistent human labeling by automating the data labeling process. Jaxon’s output feeds into predictive models and classifiers as training data.

With Jaxon, machine learning applications make more accurate classifications and predictions.

How do we deploy/host data?

Jaxon ships as a Docker stack and can run on premises or in the cloud.

Does Jaxon support other languages?

Jaxon is generally language-agnostic: it learns from statistical patterns discovered in a corpus rather than from the specific language(s) the corpus contains. We currently support English, Spanish, German, French, Italian, Portuguese, and Dutch. Jaxon can be certified for other languages as needed.

What are some specific use cases for Jaxon?

Jaxon labels can be used for any task that requires natural language processing. Some use cases that have seen great success are:

  • Monitoring social media, news, and other sources in order to assess trends

  • Training natural language understanding (NLU) models to support chatbots and conversational AI systems

  • Triaging trouble tickets for IT and customer support

  • Real-time bidding and programmatic marketing

  • Document classification for governance and compliance