Document classification is a major challenge for organizations because of the sheer volume of files that employees and customers generate daily. Named entity recognition (NER) is sometimes used to tackle the problem; it works in some cases but falls short in many others, especially in domain-specific applications.
In these situations, organizations rely on human labelers to provide the enormous amounts of training data needed for a classifier, a task that can take months and cost hundreds of thousands of dollars, especially when subject matter experts are required. Learning from the organization’s own data and domain-specific terminology, Jaxon can generate this training data using only 1% of ground truth labels and works dramatically faster than human annotators, saving on both time and cost.
Recommenders based on collaborative filtering compare object profiles for similarities. In theory, the closer two users' profiles are to each other, the more likely it is that they will prefer the same products. Conversely, the closer two products' profiles are to each other, the more likely it is that a user who likes one will also like the other.
Similarity metrics between profiles rely on readily quantifiable (and typically vectorized, at least at large scale) profile features. Profile components that consist of natural language (biographies, product marketing descriptions, consumer product reviews, and so forth) are difficult in this regard because they are less easily vectorized. Annotating these natural language components with Jaxon allows topical themes to emerge in a form that is straightforward to vectorize and feed as training data into off-the-shelf recommendation engines.
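As a rough illustration of why topic labels are easy to vectorize, the sketch below multi-hot encodes each profile's labels over a fixed vocabulary and compares profiles with cosine similarity. The topic names, the vocabulary, and the helper functions are hypothetical and chosen only for this example; they are not Jaxon output or part of any Jaxon API, and a real recommendation engine would likely use its own similarity measure.

```python
import math


def topic_vector(labels, vocabulary):
    """Multi-hot encode a profile's topic labels over a fixed vocabulary."""
    present = set(labels)
    return [1.0 if topic in present else 0.0 for topic in vocabulary]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical topic labels attached to three product descriptions.
VOCABULARY = ["battery life", "camera", "durability", "price"]
product_a = topic_vector(["battery life", "camera"], VOCABULARY)
product_b = topic_vector(["camera", "price"], VOCABULARY)
product_c = topic_vector(["durability"], VOCABULARY)

sim_ab = cosine_similarity(product_a, product_b)  # one shared topic out of two each
sim_ac = cosine_similarity(product_a, product_c)  # no shared topics
```

Products that share more labeled topics score closer to 1, and products with no topics in common score 0, which is the kind of numeric signal a similarity-based recommender can consume directly.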
Jaxon can train natural language understanding (NLU) models in support of chatbots and conversational AI systems. A typical NLU model is a classifier that attempts to identify the intents (what is the customer trying to accomplish?) and slots (specific details needed to fulfill the intent, such as a location) contained within user utterances. The training data, which Jaxon provides quickly, consists of sample utterances annotated to identify these intents and slots. Because Jaxon uses a company’s own data and does not require fully pre-labeled sample utterances, it is well suited to training domain-specific NLU models.
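To make the idea of an annotated utterance concrete, here is a minimal sketch of one training example with an intent and a single slot. The class names, the `book_restaurant` intent, and the character-offset representation are assumptions made for illustration; they do not describe Jaxon's actual data format.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    name: str
    start: int  # inclusive character offset into the utterance text
    end: int    # exclusive character offset


@dataclass
class AnnotatedUtterance:
    text: str
    intent: str
    slots: list = field(default_factory=list)

    def slot_values(self):
        """Map each slot name to the substring of the utterance it covers."""
        return {s.name: self.text[s.start:s.end] for s in self.slots}


# Hypothetical example: the intent is what the user wants to do,
# and the slot carries the detail needed to do it.
text = "Book a table in Boston"
start = text.index("Boston")
example = AnnotatedUtterance(
    text=text,
    intent="book_restaurant",
    slots=[Slot(name="location", start=start, end=start + len("Boston"))],
)
values = example.slot_values()
```

Storing slots as character spans rather than copied strings keeps each annotation tied to its exact position in the utterance, which is a common convention for sequence-labeling training data.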
Engines that provide valuable customer insight, such as sentiment analysis, brand recognition, and Voice of the Customer, require large amounts of training data and sometimes fail over time because the use of language, especially on the internet, changes at a breathtaking pace. Jaxon labels identify thematic topics in the provided example text, which carry enhanced predictive value for classifiers. The speed at which Jaxon can generate training data also allows these engines to be retrained as language changes.
Real-time annotation and analysis of streams such as social media posts or market news can be used to infer the occurrence of specific and specialized events. Individual events themselves may prove useful as, say, trading signals or pricing triggers, but second-order aggregations of events also carry value for trend analysis. Humans are not able to annotate text in real time at the scale needed for trend detection, but as Jaxon can handle streaming data, this type of analysis becomes possible.
In one example, an algorithmic hedge fund utilized a trading strategy that depended on “normal” market days with relatively low volatility. The presence of significant market news for their traded stocks or sectors (or related currencies, markets, etc.) represented risk factors regardless of the sentiment of that news. Event-based trend analysis allowed immediate reaction (in this case a liquidation of all trading positions) to an unusual frequency of relevant news.
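The frequency-based trigger in this example can be sketched as a sliding-window counter over the timestamps of annotated news events. The class name, window length, and threshold below are hypothetical values picked for illustration; the source does not describe how the fund actually implemented its trigger.

```python
from collections import deque


class EventRateMonitor:
    """Flags when more than `max_events` events fall within a sliding time window."""

    def __init__(self, window_seconds, max_events):
        self.window_seconds = window_seconds
        self.max_events = max_events
        self._timestamps = deque()

    def record(self, timestamp):
        """Record one event and return True if the window now exceeds the threshold.

        Timestamps are assumed to arrive in non-decreasing order.
        """
        self._timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop events that have aged out of the window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_events


# Hypothetical stream: four relevant news items within 30 seconds, then a quiet period.
monitor = EventRateMonitor(window_seconds=60, max_events=3)
alerts = [monitor.record(t) for t in [0, 10, 20, 30, 200]]
```

Note that the monitor reacts to the count of relevant events alone, not to their sentiment, which matches the strategy described above: any unusual burst of news is treated as a risk signal.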
Adjustment & Diversification
Although a classifier may work well in one environment, if the basic assumptions that the classifier is built on no longer apply, then the classifier will no longer perform as required. An example of this is a classifier that performs well in a ‘bear’ market, but is no longer useful in a ‘bull’ market. Jaxon provides the training data that allows the classifier to be retrained as quickly as possible as soon as an adjustment is needed.
Jaxon is also a good fit when new models are required frequently, for example when a company expands into different verticals or offers specialized services and needs to train a model for each deployment. Jaxon shortens the required training time by weeks.