Inbox Sentiment Sorting

Updated by Alexis Salcedo

What is Switchboard's Sentiment Sorting?

Every day, users send and receive messages through our platform, but sifting through the incoming messages can sometimes be tough and time consuming. In response, we have developed a model that assigns labels to incoming messages based on their content to help sort your inbox and surface the most important replies.

To do this, we have trained a language model that classifies messages into multiple categories as they’re received. The model doesn’t treat the categories as exclusive, so a message might receive more than one label, but most messages fall into only one category.

In cases where the model is unsure, it will not assign any labels to a message. These messages can be viewed under the "Include All" filter.
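For readers curious about the mechanics, below is a minimal sketch of multi-label scoring with a "no label when unsure" rule. The category keys, threshold value, and function are illustrative assumptions, not our actual implementation.

    # Minimal sketch (not production code): multi-label scoring with an
    # "unsure" fallback. Each category gets an independent confidence score,
    # and a label is applied only when its score clears a threshold. If no
    # score clears the threshold, the message is left unlabeled and shows up
    # under the "Include All" filter.

    CATEGORIES = ["relevant", "negative", "deceased", "junk"]
    THRESHOLD = 0.5  # illustrative cutoff, not the value we actually use


    def assign_labels(scores: dict) -> list:
        """Return every category whose score clears the threshold (possibly none)."""
        return [label for label in CATEGORIES if scores.get(label, 0.0) >= THRESHOLD]


    # One message gets a single label, another gets none at all.
    print(assign_labels({"junk": 0.92, "deceased": 0.03}))  # ['junk']
    print(assign_labels({"junk": 0.31, "deceased": 0.12}))  # [] -> "Include All"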

Sorting Incoming Messages by Category

Our model learns the features of messages that make them more or less likely to fall into one or more of the following categories:

  • Likely Relevant or Important: Messages that may need further inspection to determine what type of response is appropriate, if any.
  • Likely Negative or Unsupportive: Messages that contain harassment or unsupportive content.
  • Deceased: Messages indicating the intended text recipient is deceased.
  • Junk: Messages that are spam, automated, irrelevant, or incoherent.
  • Opt Out: Finally, the model also assigns an Opt Out prediction. It functions similarly to the labels above, but you can filter by specific score ranges to help narrow your searches.

How can I use this?

You can find these filters in your inbox. Instead of exporting your incoming messages and searching for keywords in a spreadsheet, you can use our scoring system to filter incoming messages right in Switchboard!

Click the “Edit Filters” button and scroll down the menu until you find the "Reply by Message Content" and "Opt-Out Prediction" sections:

For opt-outs specifically, the model rates the likelihood that a given response is an opt-out attempt, even if the respondent doesn’t use an opt-out keyword. Each message is scored on a scale from 0 to 1, with 1 being the highest confidence.
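To illustrate how a score-range filter works, here is a small sketch that keeps only the messages whose opt-out score falls inside a chosen range. The sample data and the "opt_out_score" field name are made up for this example; in practice you set the range from the "Opt-Out Prediction" filter rather than writing any code.

    # Illustrative sketch of a score-range filter. Each message carries an
    # opt-out score between 0 and 1, with 1 being the highest confidence
    # that the reply is an opt-out attempt.

    messages = [
        {"text": "Please stop texting me", "opt_out_score": 0.97},
        {"text": "Thanks, I'll be there!", "opt_out_score": 0.04},
        {"text": "not interested tbh", "opt_out_score": 0.71},
    ]


    def filter_by_score(msgs, low=0.7, high=1.0):
        """Keep messages whose opt-out score falls within [low, high]."""
        return [m for m in msgs if low <= m["opt_out_score"] <= high]


    for m in filter_by_score(messages):
        print(m["text"])  # prints the two likely opt-out attempts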

The inbox will only show messages whose scores fall within the range you indicate. You can see your selected range above the inbox, for example:

About Our Model:

Like any model, this one will not be perfect. Even though it is accurate on average, it will make mistakes. Please feel free to ask questions about scores; we will continue to monitor the model’s performance over time.

Along with occasional prediction errors, some general limitations also apply to language models like this one:

  1. The model may struggle with understanding the context and nuances of certain words or phrases, especially those with multiple meanings or those used sarcastically.
  2. It may not accurately classify text that contains spelling or grammatical errors, or unexpected sentence structures.
  3. Finally, since it was trained primarily on English text, it may not handle other languages correctly.

Performance:

During model training, we validate the model’s performance. Validation scores give a general sense of a model’s quality, but they are not conclusive because they summarize the model’s behavior overall.

Those scores, with brief explanations, are listed below:

  • We use the F1 score to represent the model's accuracy in terms of both precision and recall. The model scored 0.8, where 1 is the best possible score.
  • The model scored 0.9 on ROC AUC, a measure of the model's ability to distinguish between classes, with 1 indicating perfect classification.
  • Finally, the model correctly predicted the outcome 85% of the time on our test data (its accuracy). A short sketch of how these metrics can be computed follows this list.
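For the curious, here is a brief sketch of how these three metrics are typically computed with the scikit-learn library. The labels and predictions below are toy data, not our validation set, so the printed numbers will not match the scores above.

    # Toy illustration of the three metrics above using scikit-learn.
    # The arrays are made-up examples, not our validation data.
    from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

    y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # actual labels
    y_pred = [1, 0, 1, 0, 0, 0, 1, 1]  # the model's hard predictions
    y_prob = [0.9, 0.2, 0.8, 0.4, 0.1, 0.3, 0.7, 0.6]  # confidence scores

    # F1 balances precision (how many predicted positives were right) and
    # recall (how many actual positives were found); 1 is the best score.
    print("F1:      ", f1_score(y_true, y_pred))

    # ROC AUC measures how well the scores separate the classes;
    # 1 means perfect separation, 0.5 is no better than chance.
    print("ROC AUC: ", roc_auc_score(y_true, y_prob))

    # Accuracy is simply the fraction of predictions that were correct.
    print("Accuracy:", accuracy_score(y_true, y_pred))
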
Training Details:

To train our model, we started with Google’s open-source BERT model, which was pre-trained on Wikipedia articles and a large corpus of books. We then fine-tuned this model for our specific use case by training it on hand-coded examples of messages much like those you might see in your inbox.
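As a rough illustration only (not our actual training pipeline), fine-tuning a pre-trained BERT model for multi-label message classification with the Hugging Face transformers library looks something like the sketch below. The label names, example messages, and hyperparameters are placeholders.

    # Rough sketch of fine-tuning BERT for multi-label message classification
    # with Hugging Face transformers. Data, labels, and hyperparameters are
    # placeholders; this is not our actual training code.
    import torch
    from transformers import BertForSequenceClassification, BertTokenizerFast

    LABELS = ["relevant", "negative", "deceased", "junk", "opt_out"]

    tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
    model = BertForSequenceClassification.from_pretrained(
        "bert-base-uncased",
        num_labels=len(LABELS),
        problem_type="multi_label_classification",  # sigmoid per label, not softmax
    )

    # A couple of hand-coded training examples (placeholder data).
    texts = ["STOP texting me", "Count me in for Saturday!"]
    targets = torch.tensor([
        [0.0, 0.0, 0.0, 0.0, 1.0],  # opt_out
        [1.0, 0.0, 0.0, 0.0, 0.0],  # relevant
    ])

    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

    model.train()
    outputs = model(**batch, labels=targets)  # BCE loss for multi-label targets
    outputs.loss.backward()                   # one illustrative training step
    optimizer.step()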

Citation:

Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. CoRR, abs/1810.04805. Retrieved from http://arxiv.org/abs/1810.04805

