Amid rolled eyes, shrugged shoulders, jazz hands and warbling vocal inflections, it’s not hard to tell when someone is being sarcastic to your face. Online, however, you have to rely on that SpongeBob meme and a liberal application of the shift key to get your point across. Luckily for us netizens, DARPA’s Information Innovation Office (I2O) has collaborated with researchers from the University of Central Florida to develop a deep learning AI that can understand written sarcasm with a surprising degree of accuracy.
“Due to the high speed and volume of social media data, companies rely on tools to analyze data and to provide customer service. These tools perform tasks such as content moderation, sentiment analysis, and capturing relevant messages for the company’s customer service representatives to respond to,” Dr. Ivan Garibay, UCF Associate Professor of Industrial Engineering and Management Systems, told Engadget via email. “However, these tools lack the sophistication to identify more nuanced forms of language such as sarcasm or humor, where the meaning of a message is not always obvious and explicit. This imposes an additional burden on the social media team, which is already inundated with customer messages, to identify these messages and respond appropriately.”
As they explain in a study published in the journal Entropy, Garibay and UCF PhD student Ramya Akula built “an interpretable deep learning model using multi-head self-attention and gated recurrent units. The multi-head self-attention module aids in identifying crucial sarcastic cue-words from the input, and the recurrent units learn long-distance dependencies between these cue-words to better classify the input text.”
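The two components named above, multi-head self-attention followed by a gated recurrent unit, can be loosely sketched in plain NumPy. This is an illustrative toy, not the authors' implementation: the identity Q/K/V projections, the dimensions, the random weights, and the way the attended sequence feeds the recurrent unit are all simplifying assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(X, num_heads=2):
    """Each head attends over its own slice of the embedding; the
    (seq_len x seq_len) maps record how much each token attends to every other."""
    seq_len, d_model = X.shape
    d_head = d_model // num_heads
    outputs, maps = [], []
    for h in range(num_heads):
        S = X[:, h * d_head:(h + 1) * d_head]   # identity Q/K/V projections, for brevity
        A = softmax(S @ S.T / np.sqrt(d_head))  # attention weights; each row sums to 1
        maps.append(A)
        outputs.append(A @ S)
    return np.concatenate(outputs, axis=-1), maps

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def gru_step(x, h, Wz, Wr, Wh):
    """One gated recurrent unit step: update gate z, reset gate r."""
    hx = np.concatenate([h, x])
    z = sigmoid(Wz @ hx)
    r = sigmoid(Wr @ hx)
    h_cand = np.tanh(Wh @ np.concatenate([r * h, x]))
    return (1 - z) * h + z * h_cand

# Toy "sentence": six tokens with 8-dimensional random embeddings.
seq_len, d_model = 6, 8
X = rng.standard_normal((seq_len, d_model))
attended, attn_maps = multi_head_self_attention(X)

# Run the attended sequence through the GRU, then classify the final state.
d_hidden = 8
h = np.zeros(d_hidden)
Wz = rng.standard_normal((d_hidden, d_hidden + d_model)) * 0.1
Wr = rng.standard_normal((d_hidden, d_hidden + d_model)) * 0.1
Wh = rng.standard_normal((d_hidden, d_hidden + d_model)) * 0.1
for t in range(seq_len):
    h = gru_step(attended[t], h, Wz, Wr, Wh)

w_out = rng.standard_normal(d_hidden) * 0.1
p_sarcastic = sigmoid(w_out @ h)  # probability the toy sentence is sarcastic
```

In the real model the weights would be learned from labeled examples; here they are random, so the output probability is meaningless — the point is only the data flow from attention maps to recurrent state to a single sarcastic/not-sarcastic score.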
“Essentially, the researchers’ approach is focused on discovering patterns in the text that indicate sarcasm,” Dr. Brian Kettler, an I2O program manager who oversees the SocialSim program, explained in a recent press statement. “It identifies cue-words and their relationship to other words that are representative of sarcastic expressions or statements.”
The team’s approach differs from previous efforts to use machines to spot sarcasm on Twitter. “The older way to approach it is to sit there and handcraft the features we’re looking for,” Kettler told Engadget — perhaps drawing on linguists’ theories about what signals sarcastic language, or on punctuation marks and the context of the sentence, such as a glowing Amazon review of a product panned by the majority of customers. This model, by contrast, learned on its own to pay attention to specific words and punctuation, such as “fairness,” “too,” “perfect,” and “!,” once it had been trained. “These turn out to be the words in the sentence that carry the sarcasm and, as expected, they received much higher attention than the others,” the researchers wrote.
For this project, the researchers used a diverse collection of datasets drawn from Twitter, Reddit, The Onion, HuffPost and the Sarcasm Corpus V2 Dialogues from the Internet Argument Corpus. “That’s the beauty of this approach: all you need are training examples,” Kettler said. “Give it enough of them, and the system will learn what features of the input text are predictive of sarcasm.”
This model also provides a degree of transparency into its decision-making process that is rarely seen in deep learning AI models like this one. The sarcasm AI can actually show users which linguistic features it identified and deemed important in a given sentence through the attention maps it generates (below).
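That kind of transparency can be illustrated with a toy attention map. The sentence, the weights, and the ranking helper below are all hypothetical, but they show the general idea: averaging the attention each token receives across the whole sentence surfaces candidate cue-words.

```python
import numpy as np

def cue_word_ranking(tokens, attn):
    """Rank tokens by the average attention they receive across all
    query positions (the column means of the attention matrix)."""
    received = attn.mean(axis=0)
    order = np.argsort(-received)
    return [(tokens[i], float(received[i])) for i in order]

tokens = ["this", "product", "is", "just", "perfect", "!"]
# Hand-made attention matrix: every row sums to 1, with most of the
# mass flowing toward "perfect" and "!" (columns 4 and 5).
row = [0.05, 0.05, 0.05, 0.14, 0.36, 0.35]
attn = np.array([row] * len(tokens))

ranking = cue_word_ranking(tokens, attn)
print(ranking[0][0], ranking[1][0])  # → perfect !
```

A trained model produces these weights itself; visualizing them over the input sentence is what lets users see which words drove the sarcastic/not-sarcastic call.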
The system’s accuracy and precision are even more impressive. On the Twitter dataset, the model notched an F1 score of 98.7 (8.7 points higher than its closest rival) while, on the Reddit dataset, it scored 81.0, four points higher than the competition. On the news headlines it scored 91.8, more than five points ahead of the competing recognition systems, though it appeared to struggle slightly with the Dialogues dataset (its F1 only hit 77.2).
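For readers keeping score, F1 is the harmonic mean of precision and recall, so it rewards a classifier that both catches sarcastic posts and avoids false alarms. A quick sketch (the confusion counts are made up for illustration, not taken from the paper):

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision (tp/(tp+fp)) and recall (tp/(tp+fn))."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical tally for a sarcasm classifier:
# 90 sarcastic posts caught, 10 false alarms, 10 missed.
score = f1_score(tp=90, fp=10, fn=10)
print(round(100 * score, 1))  # → 90.0
```

Reported as a 0–100 number, as in the scores above, a perfect detector would hit 100.0.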
As the model is further developed, it could become a valuable tool for both the public and private sectors. Kettler sees this AI as well suited to further missions within the SocialSim program. “It’s part of what we’re broadly doing, really looking at and understanding the information environment online,” he said, trying to gauge “high-level engagement [and] how widely people might share different pieces of information.”
For example, if the NIH or CDC runs a public health campaign and solicits online feedback, this AI could quickly distill the public’s overall opinion of the campaign once the sarcastic responses from trolls and shitposters have been filtered out.
“We want to understand the sentiment,” he continued. “Where are people engaged, do people like something or not? And sarcasm can be deceptive when analyzing the sentiment of what we find online.”
The UCF team plans to further improve the model so that it can be used for languages other than English before open-sourcing the code, though Garibay notes that one potential sticking point is their ability to create “high-quality labeled datasets in multiple languages. Then the next big challenge is handling ambiguities, colloquialisms, slang, etc., and coping with the evolution of language.”
All products recommended by Engadget are selected by our editorial team, independent of our parent company. Some of our stories include affiliate links. If you buy something through one of these links, we may earn an affiliate commission.