WP5 will develop a proof-of-concept conversational AI system (combining chatbots and speech interaction), building on the research findings of WPs 2-4. Digital services often use chatbots to offer real-time, on-demand information to users around the clock. This also reduces the load on human customer service staff, since chatbots can handle general-purpose and recurring inquiries that do not require specialized expertise. Speech interfaces improve the inclusion of users with limited literacy skills by giving them access to chatbot services without depending on a text-based interface.
Chatbots are rule-based software systems that match keywords and patterns in textual input and respond with predefined scripts that direct the flow of questions and responses. Conversational AI systems, by contrast, engage with users through Natural Language Processing (NLP), dialogue management, machine learning, and Natural Language Generation (NLG). Chatbots that use conversational AI can better handle open-ended queries, recognize the context of user intentions, and learn from prior interactions (using reinforcement learning) and knowledge. This offers a more powerful and dynamic means of handling situationally relevant inquiries, while posing meaningful questions to users to elicit feedback.
Such services can be integrated with third-party information systems and providers to offer comprehensive information, and can maintain a history of user interactions so that follow-up inquiries and responses build on prior needs. Conversational AI systems that integrate Automatic Speech Recognition (ASR) and speech synthesis can offer a more natural means of multilingual interaction, making digital services easier to access for a wider range of users with diverse needs, backgrounds, and language and digital literacies.
Finally, sentiment analysis can help conversational AI systems better understand user intent and affect, so that they can handle interactions more responsively and direct users to human agents when needed. However, there are many social and ethical considerations and challenges around perceptions of humanness, anthropomorphism, and trust in such conversational AI systems. In particular, competence, consistency, and benevolence are central features of a system when designing for trust. Experiments with AI user interfaces within a limited scope will by design reduce the competence of the system, so perceived competence and the experience of trust may also be affected.
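To make the earlier distinction concrete, the following is a minimal sketch of a rule-based chatbot of the kind described above: keyword patterns mapped to predefined replies, with a handoff to a human counselor when nothing matches. The rules, reply texts, and the function name reply are hypothetical illustrations, not part of the WP5 system.

```python
import re

# Hypothetical rules for illustration only: each pairs a keyword pattern
# with a predefined (scripted) reply.
RULES = [
    (re.compile(r"\b(residence|permit|visa)\b", re.IGNORECASE),
     "Here is general information about residence permits."),
    (re.compile(r"\b(language|course|courses)\b", re.IGNORECASE),
     "Here is general information about language courses."),
]

# Fallback used when no rule matches the user's message.
HANDOFF = "I could not find an answer. Connecting you to a human counselor."


def reply(message: str) -> str:
    """Return the scripted reply of the first matching rule, else hand off."""
    for pattern, response in RULES:
        if pattern.search(message):
            return response
    return HANDOFF


if __name__ == "__main__":
    print(reply("How do I renew my residence permit?"))
    print(reply("I feel lost and do not know where to start."))
```

The second message matches no keyword and falls through to the handoff. Open-ended queries of this kind are where a purely rule-based design reaches its limits and where the conversational AI components discussed above would be expected to help.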
Work package 5 will implement the outcomes of the design research in work package 4 as a practical conversational digital service for evaluation. The service will allow migrant users to ask open-ended queries and will direct them to relevant information or expert human counselors as needed. The interface will be designed to be both trustworthy, in that it is built on a secure infrastructure and protects users' privacy, and trust-inducing, in that it communicates the level of privacy and allows users to control it.
In practice, we will implement the conversational interface on top of a commercial platform and services provided by our technology collaborators, including IBM, Speechly, and VoxAI. We will also collect anonymized transcripts of interactions between customer service providers and users, as well as recorded speech data from migrants (with their consent). These data will be used to design the conversational flow of the system and to evaluate how the speech recognition engine performs on diverse accents in comparison with the wider population.
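One common way to compare speech recognition performance across speaker groups is word error rate (WER): the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The sketch below assumes WER as the metric, which the project description does not specify. The function names and the sample format of (group, reference, hypothesis) tuples are likewise hypothetical.

```python
from collections import defaultdict


def edit_distance(ref: list, hyp: list) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer_by_group(samples) -> dict:
    """Corpus-level WER per group from (group, reference, hypothesis) tuples."""
    errors = defaultdict(int)
    words = defaultdict(int)
    for group, reference, hypothesis in samples:
        ref = reference.lower().split()
        hyp = hypothesis.lower().split()
        errors[group] += edit_distance(ref, hyp)
        words[group] += len(ref)
    # Groups with no reference words are skipped to avoid division by zero.
    return {g: errors[g] / words[g] for g in words if words[g] > 0}


if __name__ == "__main__":
    # Made-up example utterances, not project data.
    samples = [
        ("group_a", "the cat sat", "the cat sat"),
        ("group_b", "the cat sat down", "the hat sat"),
    ]
    print(wer_by_group(samples))
```

A gap in WER between groups would indicate that the recognizer performs unevenly across accents. Whether such a gap is meaningful depends on sample size and recording conditions, which this sketch does not address.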
Work Package Leaders:
Tom Bäckström, tom.backstrom(a)aalto.fi and Nitin Sawhney, nitin.sawhney(a)aalto.fi
Researchers:
Silas Rech
Anastasiia Chizhikova
Preetha Datta