Jaxon uses AI to build and train machine learning models for text classification. Jaxon’s automation improves training by adding labels, extracting useful features and pretraining artifacts, testing architectures and parameters, and incorporating AI-based and human-driven sources of domain knowledge.


Taking Human Labelers out of the Loop

What used to take months now takes minutes!

People can be creative powerhouses, but when it comes to simple repetitive tasks, machines rule the day. Modern manufacturing relies on people to design the factory, but the assembly lines themselves are automated, making them cheaper, faster, and more consistent.

The same goes for machine learning. People must be involved with training pipeline design, but they shouldn’t be a part of the pipeline.


With Jaxon, users design the factory; they don’t work on the assembly line!

Synthetic Labeling


For a given classification algorithm, the accuracy of the resulting model correlates directly with the number of labeled examples provided for training. Labeled training data, however, is difficult and expensive to acquire at scale.

Jaxon augments a small number of manual Ground Truth labels (approximately 1% of the overall dataset) by synthesizing labels for the remainder of the training data, creating a much larger training dataset. While a training dataset composed entirely of Ground Truth labels is ideal for optimal F1, incorporating synthetic labels dramatically reduces the number of manual labels needed and still yields accuracy comparable to that of a classifier trained on a 100% Ground Truth dataset.
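One common way to bootstrap labels like this is self-training with confidence thresholding: fit a simple model on the small Ground Truth seed, then assign synthetic labels only where the model is confident. The sketch below (a nearest-centroid toy on one-dimensional features, not Jaxon's actual proprietary method) illustrates the idea, including abstaining on ambiguous examples:

```python
def nearest_centroid_fit(X, y):
    """Compute one centroid per class from the small labeled seed."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def pseudo_label(centroids, X, margin=0.5):
    """Assign a synthetic label only when one class is clearly closest."""
    labels = []
    for x in X:
        dists = sorted((abs(x - c), label) for label, c in centroids.items())
        (d1, best), (d2, _) = dists[0], dists[1]
        # Abstain (None) when the two nearest centroids are too close to call.
        labels.append(best if d2 - d1 >= margin else None)
    return labels

# Tiny seed of manual labels (the ~1%), then pseudo-label the rest.
centroids = nearest_centroid_fit([0.1, 0.2, 0.9, 1.0],
                                 ["neg", "neg", "pos", "pos"])
synthetic = pseudo_label(centroids, [0.05, 0.55, 0.95])
# synthetic -> ["neg", None, "pos"]; 0.55 is too ambiguous to label
```

In a real pipeline the confident synthetic labels would be folded back into the training set, and the abstentions either left unlabeled or routed to other labeling sources.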


Custom Rules

Substantial signal can often be represented as simple “common sense” rules. Jaxon leverages ensembling techniques such as Snorkel to combine user-provided rules with machine learning models. This hybrid approach allows human knowledge to augment machine learning, resulting in a powerful combination.
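In the Snorkel style, each "common sense" rule becomes a labeling function that votes on an example or abstains. The sketch below combines such rules with a simple majority vote; real Snorkel goes further and learns each rule's accuracy rather than weighting votes equally. The rules and label values here are illustrative inventions:

```python
ABSTAIN, HAM, SPAM = -1, 0, 1

# Each labeling function encodes one piece of "common sense" and may abstain.
def lf_contains_offer(text):
    return SPAM if "free offer" in text.lower() else ABSTAIN

def lf_greeting(text):
    return HAM if text.lower().startswith(("hi", "hello")) else ABSTAIN

def lf_many_exclaims(text):
    return SPAM if text.count("!") >= 3 else ABSTAIN

def majority_vote(text, lfs):
    """Combine rule votes; abstentions carry no weight."""
    votes = [v for v in (lf(text) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    return max(set(votes), key=votes.count)

lfs = [lf_contains_offer, lf_greeting, lf_many_exclaims]
majority_vote("Claim your FREE OFFER now!!!", lfs)  # -> SPAM (1)
```

Labels produced this way can then supervise a downstream machine learning model, which generalizes beyond the literal rules.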

Neural Architecture Design


Jaxon provides a common interface to rapidly train, hyperparameterize, and evaluate neural text classifiers. Users can craft and apply problem-specific data and knowledge without writing code, view intermediate results, and make adjustments along the way. Jaxon enables users to focus on the high-level creative parts of model design, not the plumbing.

Jaxon leverages state-of-the-art architectures such as:

  • GRU (Gated Recurrent Unit)
  • AWD-LSTM (ASGD Weight-Dropped Long Short-Term Memory)
  • DistilBERT (a distilled version of BERT, Bidirectional Encoder Representations from Transformers)
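Evaluating candidate architectures against a common interface reduces, at its core, to scoring each one the same way and keeping the winner. A minimal sketch (the scorer here is a stand-in dictionary; in practice it would train and validate a real model):

```python
def select_architecture(candidates, train_and_score):
    """Score every candidate architecture and return the best one."""
    scores = {name: train_and_score(name) for name in candidates}
    return max(scores, key=scores.get), scores

# Stand-in validation scores; a real scorer would run a training pipeline.
demo_scores = {"gru": 0.81, "awd_lstm": 0.84, "distilbert": 0.88}
best, scores = select_architecture(["gru", "awd_lstm", "distilbert"],
                                   demo_scores.get)
# best -> "distilbert"
```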

As new algorithms and architectures are discovered, Jaxon’s patent-pending technology incorporates them.

Model Calibration

Human-produced training labels are noisy. The old adage “Garbage In, Garbage Out” applies to data science as much as anywhere. Training and validating against noisy human-produced labels can yield misleading results.

Jaxon internalizes label quality through a three-tier system:

  • Gold labels have been verified as Ground Truth by multiple humans.
  • Silver labels have been provided by a single human labeler or system of record.
  • Bronze labels are bootstrapped by Jaxon’s synthetic labeling models or by user-provided rules and heuristics.

Noise-reduction techniques use Gold labels to improve the training utility of Silver and Bronze labels. Jaxon weighs the label tiers strategically, maximizing the signal across all of the data while minimizing the noise contributed by Silver and Bronze labels.
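One simple way to make noisier tiers count for less is per-example loss weighting. The sketch below uses invented trust weights (the specific values and the weighting scheme are illustrative assumptions, not Jaxon's actual calibration method) in a weighted negative log-likelihood:

```python
import math

# Hypothetical per-tier trust weights; higher = more trusted.
TIER_WEIGHT = {"gold": 1.0, "silver": 0.6, "bronze": 0.3}

def weighted_nll(examples):
    """Weighted negative log-likelihood: noisier tiers pull less on the loss.

    Each example is (predicted probability of its labeled class, tier).
    """
    total_w = sum(TIER_WEIGHT[tier] for _, tier in examples)
    loss = sum(-TIER_WEIGHT[tier] * math.log(p) for p, tier in examples)
    return loss / total_w

batch = [(0.9, "gold"), (0.8, "silver"), (0.5, "bronze")]
```

Here the poorly predicted Bronze example contributes far less to the gradient than an equally poor Gold example would, so the optimizer is steered primarily by verified Ground Truth.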

Artificial Intelligence


In addition to utilizing pretrained language models like BERT, Jaxon enables refinement of the pretrained model by applying a schedule of custom datasets. By fine-tuning on domain-related datasets as well as on the training data itself, a very specialized vocabulary can be developed that enhances the learning speed and ultimate performance ceiling of a neural model.


Custom Training Schedules


Deep Neural Networks are trained over several epochs. During each epoch, the model processes each example in the training dataset and a loss function is calculated. Typically, training context remains the same throughout all epochs: parameters, dataset, loss function, layer management, feature vocabulary, etc. When the loss has converged, training stops.
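That fixed-context loop can be sketched abstractly as "repeat epochs until the loss stops changing." The skeleton below keeps only that control flow; `epoch_loss` stands in for a full pass over the dataset:

```python
def train(epoch_loss, max_epochs=100, tol=1e-4):
    """Run epochs until the change in loss falls below tol (convergence)."""
    prev = float("inf")
    for epoch in range(1, max_epochs + 1):
        loss = epoch_loss(epoch)  # one full pass over the training dataset
        if abs(prev - loss) < tol:
            return epoch, loss    # loss has converged; stop training
        prev = loss
    return max_epochs, prev

# Toy loss that decays each epoch, so convergence eventually triggers.
epochs, final_loss = train(lambda e: 1.0 / e, tol=1e-3)
```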

Jaxon implements custom multi-stage Training Schedules that support changing context:

  • Application of different training datasets.
  • Synthesis of labels for unlabeled corpora.
  • Strategic layer freezing scheduled across training stages.
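A multi-stage schedule like the one above can be represented as data: each stage names its dataset and which layer groups stay frozen. The layer names, stage order, and datasets below are illustrative, not Jaxon's actual schedules:

```python
# Hypothetical three-stage schedule: fine-tune the head first, then
# progressively unfreeze deeper layers while switching datasets.
SCHEDULE = [
    {"dataset": "domain_corpus", "frozen": ["embeddings", "encoder"]},
    {"dataset": "silver_labels", "frozen": ["embeddings"]},
    {"dataset": "gold_labels",   "frozen": []},
]

def apply_stage(layers, stage):
    """Return per-layer trainability for one stage of the schedule."""
    return {name: name not in stage["frozen"] for name in layers}

layers = ["embeddings", "encoder", "classifier_head"]
trainable_by_stage = [apply_stage(layers, s) for s in SCHEDULE]
```

Gradual unfreezing like this is a known technique for fine-tuning pretrained models: early stages adapt the task head without disturbing pretrained weights, and later stages refine the whole network.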

How you train is just as important as what you train.


Analyze results with Jaxon. Model evaluation begins with an F1 score, but with Jaxon, users can visualize the data, identify areas of weakness, and iterate to improve performance.
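For reference, F1 is the harmonic mean of precision and recall, computed per class from true/false positives and false negatives. A minimal implementation (generic, not Jaxon-specific):

```python
def f1_score(y_true, y_pred, positive):
    """Binary F1 for the given positive class."""
    tp = sum(t == positive == p for t, p in zip(y_true, y_pred))
    fp = sum(p == positive != t for t, p in zip(y_true, y_pred))
    fn = sum(t == positive != p for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

f1_score(["pos", "pos", "neg", "neg"],
         ["pos", "neg", "pos", "neg"], positive="pos")  # -> 0.5
```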


Jaxon fits a wide variety of use cases and solutions.

Contact Us


Jaxon is an AI platform that trains other AI to more accurately understand natural language from raw text.

(617) 506-9410
177 Huntington Avenue
Suite 1703
Boston, Massachusetts

© Copyright 2020. All rights reserved.
