Applying Emotion and Sentiment Analysis in IoT Applications
Leor Grebler
If you knew your device was making users unhappy while they were using it, would you do something to change how they feel?
The new opportunities in AI come down to affecting user behavior. Already, Facebook, Google, and Amazon, among others, deploy armies of scientists to keep us clicking, scrolling, and engaged with their ad funnel. And then there's the whole politics of influence, which brings up an interesting development: the same AI tools available to these larger companies are now open to anyone who wants to use them.
App and web developers can now use these tools, but the most impactful applications are on IoT devices, because we are more influenced by physical things: haptics, colors, sounds, smells, heat, and movement. These can't be replicated in apps.
One might conjure up visions of a runaway AI using these tools to manipulate humanity into doing its bidding, but if we're transparent with users, we can potentially use our devices and AI to nudge them toward their stated goals.
If our goal is to improve an index such as user happiness, we first need to measure it. Today, several tools can do this, and they're becoming even better at it than humans, starting with identifying who is speaking and classifying attributes such as gender and accent from the voice alone.
A savvy UX developer might then set the gender of the text-to-speech engine, an accent, cadence, or other speech features to match those of the user, which can help put the user more at ease. By identifying the user who's present, it's also possible to tailor content specifically to that user or to load their profile.
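For instance, a device might map detected speaker attributes to one of its available TTS voices. The sketch below is illustrative only; the attribute values and voice IDs are hypothetical placeholders rather than any particular engine's API.

```python
# Sketch: pick a TTS voice that loosely matches the detected speaker.
# The detected attributes and voice IDs are hypothetical placeholders --
# substitute whatever your speaker-ID service and TTS engine actually expose.

AVAILABLE_VOICES = {
    ("female", "en-US"): "en-US-female-1",
    ("male", "en-US"): "en-US-male-1",
    ("female", "en-GB"): "en-GB-female-1",
    ("male", "en-GB"): "en-GB-male-1",
}
DEFAULT_VOICE = "en-US-female-1"

def pick_voice(detected_gender: str, detected_accent: str) -> str:
    """Return the TTS voice that best matches the detected speaker."""
    return AVAILABLE_VOICES.get((detected_gender, detected_accent), DEFAULT_VOICE)

# Example: a speaker-identification service reports the current user's profile.
print(pick_voice("male", "en-GB"))  # -> "en-GB-male-1"
```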
Companies that offer APIs to do identification and classification include Microsoft, Alchemy, Kaggle, and others. Their business models vary from micropennies per API call, to a flat fee, to a per-device license.
IBM Watson is one such service. Feed it text, and it will return multiple aspects of the person's use of language and personality, such as Big Five traits, needs, and values.
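As a rough sketch, here's how a device backend might send accumulated user text to a Watson-style personality service over HTTPS. The service URL, version date, and credential handling below are placeholders based on IBM's public documentation and may differ for your account, so treat this as an outline rather than a drop-in integration.

```python
# Sketch: send accumulated user text to IBM Watson Personality Insights and
# read back the trait scores. The URL, version date, and API key below are
# placeholders -- check the current Watson documentation for your instance.
import requests

WATSON_URL = "https://gateway.watsonplatform.net/personality-insights/api/v3/profile"
API_KEY = "YOUR_IAM_API_KEY"  # placeholder credential

def analyze_personality(text: str) -> dict:
    """Return a personality profile (Big Five, needs, values) for `text`."""
    response = requests.post(
        WATSON_URL,
        params={"version": "2017-10-13"},
        headers={"Content-Type": "text/plain"},
        auth=("apikey", API_KEY),
        data=text.encode("utf-8"),
    )
    response.raise_for_status()
    return response.json()

# Example: print Big Five percentiles for a block of user text (needs ~100+ words).
sample_text = "Replace this with at least a hundred words of the user's own writing..."
profile = analyze_personality(sample_text)
for trait in profile.get("personality", []):
    print(trait["name"], round(trait["percentile"], 2))
```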
One of the limitations of these services is the amount of text needed for analysis. Watson, for instance, needs at least 100 words, which is longer than a typical voice command or text input to a device.
There are a few ways a device maker could address this. First, there's an approach that could make users a bit uneasy: continuously recording and transcribing conversation. The limitations of this method are that the service would also need to diarize the conversation if more than one person were speaking, and continuous transcription is typically prone to error. Slightly less creepy sources are voicemail transcriptions or voice messages in apps like WhatsApp.
Another method is to accumulate utterances over time and send them for sentiment analysis once a minimum length has been reached. The plus side is that this is fairly easy to implement. The downside is that it doesn't provide real-time analysis, and because the time between samples and the content and context can differ, the analysis might be skewed.
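A minimal sketch of this accumulate-and-send approach, assuming a 100-word threshold like Watson's, might look like the following. The UtteranceBuffer class and the word-count check are illustrative, not taken from any particular SDK.

```python
# Sketch: buffer short voice-command transcripts until there is enough text
# to justify a sentiment/personality call (e.g., Watson's ~100-word minimum).
from typing import Optional

class UtteranceBuffer:
    def __init__(self, min_words: int = 100):
        self.min_words = min_words
        self.utterances = []

    def add(self, utterance: str) -> Optional[str]:
        """Store an utterance; return the combined text once the threshold is met."""
        self.utterances.append(utterance)
        combined = " ".join(self.utterances)
        if len(combined.split()) >= self.min_words:
            self.utterances = []
            return combined
        return None

buffer = UtteranceBuffer(min_words=100)
for command in ["turn on the lights", "what's the weather like today"]:
    text = buffer.add(command)
    if text is not None:  # only fires once enough words have accumulated
        print("ready for analysis:", len(text.split()), "words")
```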
The other approach is to fuse sentiment data from other sources with the voice interaction. For example, if someone has just sent an angry text message or written a loving email, we could get a clearer idea of their state of mind and tune the response to a voice request accordingly.
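One simple way to combine these signals is a weighted average of per-source sentiment scores, with weights reflecting how much you trust each source or how recent it is. The source names and weights in this sketch are purely illustrative.

```python
# Sketch: fuse sentiment scores (-1.0 very negative .. +1.0 very positive)
# from several sources into one estimate of the user's current state of mind.
# The source names and weights are illustrative assumptions.

def fuse_sentiment(scores: dict, weights: dict) -> float:
    """Weighted average of per-source sentiment scores."""
    total = sum(weights.get(src, 0.0) for src in scores)
    if total == 0:
        return 0.0
    return sum(s * weights.get(src, 0.0) for src, s in scores.items()) / total

recent_signals = {"voice": -0.2, "text_message": -0.8, "email": 0.4}
trust = {"voice": 0.5, "text_message": 0.3, "email": 0.2}
mood = fuse_sentiment(recent_signals, trust)
print("fused sentiment:", round(mood, 2))  # negative -> soften the spoken response
```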
[embed]https://www.youtube.com/watch?v=aOcpxUChGBE[/embed]
Today, a few other companies offer emotion detection from voice, both as an API and as embedded software. These include Affectiva, EmoVoice, and Vokaturi. In addition to voice, some APIs now use machine-learning vision to provide real-time emotion data and personality information.
Microsoft's Face API, for example, provides age, gender, and emotion estimates based on facial analysis. Any device with a camera could take stills, upload them to the API, and continuously feed the results back to any apps running in parallel on the device. Perhaps there could be triggers based on negative-emotion detection?
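A camera-equipped device could implement that loop roughly as follows. The endpoint URL, request format, response fields, and the 0.6 threshold are all assumptions here; map them onto whichever face or emotion API you actually use.

```python
# Sketch: periodically upload camera stills to a face/emotion API and trigger
# a mitigating action when negative emotion crosses a threshold. The endpoint,
# request format, response fields, and threshold are hypothetical placeholders.
import time
import requests

EMOTION_API_URL = "https://example-emotion-api.com/v1/analyze"  # placeholder
API_KEY = "YOUR_API_KEY"  # placeholder credential
NEGATIVE_EMOTIONS = ("anger", "sadness", "disgust", "fear")

def capture_still() -> bytes:
    """Grab a JPEG from the device camera (a saved test frame stands in here)."""
    with open("frame.jpg", "rb") as f:
        return f.read()

def analyze_frame(jpeg_bytes: bytes) -> dict:
    """Send a single still and return per-emotion scores in the range 0..1."""
    response = requests.post(
        EMOTION_API_URL,
        headers={"Authorization": f"Bearer {API_KEY}",
                 "Content-Type": "application/octet-stream"},
        data=jpeg_bytes,
    )
    response.raise_for_status()
    return response.json()["emotions"]

while True:
    emotions = analyze_frame(capture_still())
    if sum(emotions.get(e, 0.0) for e in NEGATIVE_EMOTIONS) > 0.6:
        print("Negative emotion detected - trigger a mitigating response")
    time.sleep(5)  # poll every few seconds
```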
The first application is matching. When I was cold calling into southern Kentucky, I'd occasionally catch myself adopting a drawl and slowing my speech when prospects answered the phone. The subconscious effort was to make myself more relatable to the person on the other end.
There isn't a big barrier to AIs detecting these cues and doing the same thing in our voice interactions. Cadence, gender, and tone can be matched very quickly. Based on sentiment analysis, we can also adapt the terseness of the interaction: are the user's responses short? Then ours should be short as well.
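For example, a voice assistant could mirror the user's brevity with a heuristic as simple as the following sketch, where the four-word threshold is an arbitrary assumption to tune per product.

```python
# Sketch: mirror the user's terseness by choosing a short or long response style.
# The four-word threshold is an arbitrary assumption to tune per product.

def pick_response(user_utterance: str, short_reply: str, long_reply: str) -> str:
    """Return the terse reply for terse users, the fuller reply otherwise."""
    return short_reply if len(user_utterance.split()) <= 4 else long_reply

terse = "Sunny, 22 degrees."
full = "It's sunny right now, with a high of 22 degrees expected this afternoon."
print(pick_response("weather?", terse, full))                                       # terse user
print(pick_response("can you tell me what the weather is like today", terse, full)) # chatty user
```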
The second application is reacting to negative emotions. When it detects a negative emotion, a system can try various responses to mitigate the negativity.
There is an opportunity for developers to create automated responses based on emotion and to start layering on adaptations as we learn about the user's state of mind. This is where we can apply machine learning to understand which adaptations have the most impact.
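One lightweight way to learn which adaptation works is an epsilon-greedy bandit over the available responses, rewarded by how much the next sentiment reading improves. This is a sketch that assumes you can score sentiment before and after each response; the adaptation names are placeholders.

```python
# Sketch: epsilon-greedy selection over candidate adaptations (humor, empathy,
# shorter replies, ...), rewarded by the change in measured user sentiment.
# Adaptation names and the reward scale are illustrative assumptions.
import random

class AdaptationBandit:
    def __init__(self, adaptations, epsilon: float = 0.1):
        self.epsilon = epsilon
        self.counts = {a: 0 for a in adaptations}
        self.values = {a: 0.0 for a in adaptations}  # running mean reward per adaptation

    def choose(self) -> str:
        if random.random() < self.epsilon:
            return random.choice(list(self.counts))   # explore a random adaptation
        return max(self.values, key=self.values.get)  # exploit the best one so far

    def update(self, adaptation: str, reward: float) -> None:
        """Reward = sentiment measured after the response minus sentiment before it."""
        self.counts[adaptation] += 1
        n = self.counts[adaptation]
        self.values[adaptation] += (reward - self.values[adaptation]) / n

bandit = AdaptationBandit(["use_humor", "empathize", "shorten_replies"])
choice = bandit.choose()
bandit.update(choice, reward=0.3)  # sentiment improved after this adaptation
```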