Jaxon uses AI to build and train machine learning models for text classifiers. Jaxon automation improves training by adding labels, extracting useful features and pretraining artifacts, testing architectures and parameters, and incorporating AI-based and human-driven sources of domain knowledge.
Taking Human Labelers out of the Loop
What used to take months to do now takes minutes!
People can be creative powerhouses, but when it comes to simple repetitive tasks, machines rule the day. Modern manufacturing relies on people to design the factory, but assembly lines are automated - this makes them cheaper, faster, and more consistent.
The same goes for machine learning. People must be involved with training pipeline design, but they shouldn’t be a part of the pipeline.
With Jaxon, users design the factory; they don’t work on the assembly line!
For a given classification algorithm, model accuracy generally improves as the number of labeled training examples grows, but labeled training data is difficult and expensive to acquire at scale.
Jaxon starts from a small number of manual Ground Truth labels (approximately 1% of the overall dataset) and synthesizes labels for the remainder of the training data, creating a larger training dataset. A training dataset composed entirely of Ground Truth labels is ideal for F1, but incorporating synthetic labels dramatically reduces the number of manual labels a classifier needs while still yielding accuracy comparable to that of a classifier trained on a 100% Ground Truth dataset.
Substantial labeling headway can often be made using simple "common sense" heuristics that represent human knowledge while requiring minimal human involvement. Jaxon leverages ensembling techniques, such as those used by Snorkel, to combine human-provided heuristics in the form of regular expressions with cutting-edge machine learning models. These work together to label large datasets automatically and improve training outcomes. Our hybrid approach allows human understanding to augment machine learning and lets the machines do the heavy lifting, resulting in a powerful combination.
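To make the idea concrete, here is a minimal sketch of regular-expression heuristics voting on labels. It is not Jaxon's implementation and it does not use Snorkel's API: the class names ("billing", "technical"), the patterns, and the function names are all hypothetical, and a plain majority vote stands in for the learned label models that real ensembling tools use.

```python
import re
from collections import Counter

ABSTAIN = None  # a heuristic returns this when it has no opinion

# Hypothetical heuristics for an illustrative two-class problem.
HEURISTICS = [
    (re.compile(r"\b(invoice|refund|charge[sd]?)\b", re.I), "billing"),
    (re.compile(r"\b(error|crash(es|ed)?|bug)\b", re.I), "technical"),
    (re.compile(r"\bpayment\b", re.I), "billing"),
]


def apply_heuristics(text):
    """Return one vote per heuristic: its label on a match, else ABSTAIN."""
    return [label if pattern.search(text) else ABSTAIN
            for pattern, label in HEURISTICS]


def majority_label(votes):
    """Combine votes by simple majority; abstain on no votes or on a tie."""
    counts = Counter(v for v in votes if v is not ABSTAIN)
    if not counts:
        return ABSTAIN
    ranked = counts.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


def label_corpus(texts):
    """Attach a synthetic label (or ABSTAIN) to every text."""
    return [(text, majority_label(apply_heuristics(text))) for text in texts]
```

Texts on which the heuristics abstain or disagree are the natural candidates to hand to a machine learning model or, sparingly, to a person.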
Neural Architecture Design
Jaxon provides a common interface to rapidly train, tune hyperparameters for, and evaluate neural text classifiers. Users can craft and apply problem-specific data and knowledge without writing code, view intermediate results, and make adjustments along the way. Jaxon enables users to focus on the high-level creative parts of model design, not the plumbing.
Jaxon leverages state-of-the-art architectures such as:
As new algorithms and architectures are discovered, Jaxon’s patent-pending technology incorporates them.
Human-produced training labels are dirty. The old adage “Garbage In, Garbage Out” applies well to data science. Training and validating against human-produced labels can culminate in misleading, dirty results.
Jaxon internalizes label quality:
Noise reduction techniques help utilize Gold labels to improve the training utility of Silver and Bronze labels. Jaxon strategically utilizes different label classes to maximize the signal available throughout all available data while minimizing the noise in Silver and Bronze labels.
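One simple way to picture this is to weight each noisier label by how often its tier agrees with Gold. The sketch below is an illustration only, not Jaxon's noise reduction method. It assumes that Gold labels are the most trusted and that Silver and Bronze are progressively noisier; the function names and the default weight are hypothetical.

```python
def tier_reliability(gold, noisy):
    """Fraction of examples labeled in both sets where the noisy label matches Gold.

    Both arguments map example id -> label. Returns None when the sets share
    no examples, since agreement cannot be estimated in that case.
    """
    shared = [key for key in noisy if key in gold]
    if not shared:
        return None
    agree = sum(1 for key in shared if noisy[key] == gold[key])
    return agree / len(shared)


def weighted_examples(gold, tiers, default_weight=0.5):
    """Build (example_id, label, weight) rows for weighted training.

    `tiers` is a list of (name, labels) pairs ordered from most to least
    trusted. Gold labels get weight 1.0 and always win; every other example
    takes the label of the first tier that covers it, weighted by that
    tier's measured agreement with Gold.
    """
    rows = [(key, label, 1.0) for key, label in gold.items()]
    seen = set(gold)
    for _name, labels in tiers:
        weight = tier_reliability(gold, labels)
        if weight is None:
            weight = default_weight  # hypothetical fallback, not from the source
        for key, label in labels.items():
            if key not in seen:
                rows.append((key, label, weight))
                seen.add(key)
    return rows
```

The resulting weights can be passed to any training loop that supports per-example loss weights, so noisier tiers still contribute signal without dominating it.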
Pretraining fits parameters (weights) to a large, unlabeled dataset in order to gain general skills in some problem domain like computer vision or NLP. Once this initial training has been completed, the pretrained model is reused to solve other specific tasks within the same problem domain. This is a form of transfer learning, allowing a large amount of unlabeled data to support a much smaller labeled dataset.
To solve a target task, pretrained models can either be fine-tuned on datasets particular to the new task at hand (e.g. classification), or transformed wherein the pretrained model remains fixed and its output is used as the input to a new model trained on the new dataset. When fine-tuning, the pretrained model parameters are treated as initial values for the new task-specific model. When transforming, the output of the pretrained model serves as abstracted feature vectors when training the separate, and usually different, model that solves the new task.
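The difference between the two options can be shown with a toy model. The sketch below is a deliberately tiny stand-in, not a real language model: the "pretrained encoder" is a single weight vector, the task head is a logistic unit, and all names are hypothetical. With `fine_tune=False` the encoder stays fixed and only the head learns (transforming); with `fine_tune=True` the pretrained weights are the starting point and are updated too.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def encode(w, x):
    """Toy 'pretrained encoder': a single linear feature h = w . x."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train(pretrained_w, data, fine_tune, epochs=200, lr=0.1):
    """Train a logistic head (a, b) on top of the encoder with plain SGD.

    fine_tune=False: encoder weights stay fixed (transforming).
    fine_tune=True:  encoder weights start at the pretrained values and
                     receive gradient updates as well (fine-tuning).
    """
    w = list(pretrained_w)  # copy so the caller's weights are never mutated
    a, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:
            h = encode(w, x)
            err = sigmoid(a * h + b) - y  # d(log loss) / d(logit)
            grad_a, grad_b = err * h, err
            if fine_tune:
                # Chain rule through the head: d(logit)/d(w_i) = a * x_i
                w = [wi - lr * err * a * xi for wi, xi in zip(w, x)]
            a -= lr * grad_a
            b -= lr * grad_b
    return w, a, b
```

Calling `train(w0, data, fine_tune=False)` returns the encoder weights unchanged, while `fine_tune=True` returns weights that have drifted away from `w0` toward the new task.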
In addition to utilizing pretrained language models like the BERT family, Jaxon enables refinement of a pretrained model by continuing pretraining with problem-specific datasets. This task-specific pretraining helps capture the nuances of domain-specific language: specialized vocabulary, syntax, and semantics. It speeds up learning and raises the ultimate performance ceiling for a neural model.
Custom Training Schedules
Deep Neural Networks are trained over several epochs. During each epoch, the model processes each example in the training dataset and a loss function is calculated. Typically, training context remains the same throughout all epochs: parameters, dataset, loss function, layer management, feature vocabulary, etc. When the loss has converged, training stops.
Jaxon implements custom multi-stage Training Schedules that support changing context:
How you train is just as important as what you train.
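As a rough picture of what a multi-stage schedule could look like, the sketch below describes each stage as data and runs the stages in order, ending a stage when its loss stops changing. This is an assumption-laden illustration, not Jaxon's schedule format: the `Stage` fields, the `run_schedule` function, and the convergence tolerance are all hypothetical, and `run_epoch` is a callback the caller supplies to do the actual training.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Stage:
    """Training context for one stage; every field may differ between stages."""
    name: str
    dataset: str
    loss: str
    learning_rate: float
    max_epochs: int
    frozen_layers: Tuple[str, ...] = ()


def run_schedule(
    stages: List[Stage],
    run_epoch: Callable[[Stage, int], float],
    tol: float = 1e-3,
) -> List[Tuple[str, int, float]]:
    """Run the stages in order and return (stage name, epoch, loss) records.

    A stage ends when the loss changes by less than `tol` between two
    consecutive epochs, or when `max_epochs` is reached.
    """
    history = []
    for stage in stages:
        previous = None
        for epoch in range(stage.max_epochs):
            loss = run_epoch(stage, epoch)  # caller trains one epoch here
            history.append((stage.name, epoch, loss))
            if previous is not None and abs(previous - loss) < tol:
                break  # loss has converged for this stage
            previous = loss
    return history
```

A schedule might, for example, begin with a stage that keeps the pretrained layers frozen on a large synthetic-label dataset and finish with a lower learning rate on Ground Truth labels.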
Analyze results with Jaxon. Model evaluation begins with an F1 score, but with Jaxon, users can visualize the data, identify areas of weakness, and iterate to improve performance.
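For reference, F1 is the harmonic mean of precision and recall. The short sketch below computes it per class, which is one straightforward way to spot the classes where a model is weakest. The function names are illustrative and are not part of Jaxon.

```python
def f1_per_class(y_true, y_pred):
    """Return {label: F1} for every label seen in either list."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        scores[label] = 2 * precision * recall / total if total else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)
```

A class whose F1 sits well below the macro average is a good place to start looking at misclassified examples.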
Jaxon fits a wide variety of use cases and solutions - see a few here:
© Copyright 2020. All rights reserved.