Building accessible chatbots at Yellow
Web accessibility, commonly known as a11y, refers to the inclusive practice of designing and developing websites and web applications that can be used by people of all abilities and disabilities. This includes people with visual, auditory, physical, speech, cognitive, and neurological disabilities.
Web accessibility is crucial because it ensures everyone has equal access to information and functionality on the Internet.
One of the most important aspects of web accessibility at Yellow.ai is ensuring our chatbots are accessible. Chatbots have become increasingly popular in recent years, and they are now used by many businesses and organisations to communicate with their customers. However, if chatbots are not designed and developed with accessibility in mind, they can create barriers for people with disabilities, leading to frustration and exclusion, and even to legal issues for businesses and organisations that fail to meet accessibility standards.
According to a report shared by our accessibility audit firm in early 2023, we had approximately 120 accessibility issues on Mac and 90 on Windows. This indicated that our user experience fell short for users with disabilities and was not compliant with the WCAG guidelines. As a result, we were not a viable option for potential customers in North America, or for government and educational institutions worldwide.
In this article, we will explore the challenges we faced and the journey of how we achieved WCAG compliance by improving the a11y of our chatbot interface.
Challenges
Before we start on how we built an accessible interface, let me walk you through some of the challenges we faced when we first started.
Lack of semantic HTML
In yellow.ai’s early days, the primary focus was on swift customer acquisition, so we moved fast, adding features without strict adherence to standard practices. The team, composed largely of “full-stack” engineers with limited awareness of these practices, used divs for everything. The team also overlooked CSS Grid and Flexbox, relying instead on margin, padding, and position workarounds. This meant the DOM was out of order, and VoiceOver and other assistive technologies failed to parse the content in a user-friendly way.
Not using the platform
We extensively used JavaScript for various tasks in our chatbot, such as calculating the height of the chat window, adjusting the position of certain elements, and manipulating the focus order. Instead of utilizing the tools already provided by HTML and CSS, we relied solely on JavaScript. While this approach initially allowed us to implement features quickly, it ultimately led to several accessibility issues.
Our over-reliance on JavaScript meant that the focus order in the chatbot was incorrect. This caused issues for users who relied on keyboard navigation, as they could not navigate the chatbot logically and intuitively. The elements in the DOM were out of order, which caused problems for screen readers and other assistive technologies. As a result, users with disabilities could not access the chatbot's content as easily as they should have been able to.
Customizations
Yellow allows customers to personalize the chatbot's appearance and behavior to suit their preferences. However, this customization could affect the contrast ratio, posing a challenge for visually impaired or color-blind users. Customers can also upload images and videos as bot responses; without appropriate alternative text and closed captions, these pose a further challenge for visually impaired users.
Weak TTS/STT support
Our chatbot didn't have reliable Text-to-Speech (TTS) or Speech-to-Text (STT) functionalities at the beginning of our journey. This was a significant barrier for users with visual or physical disabilities, as they could not interact with the chatbot as easily as other users unless they had certain accessibility plugins installed. Users with visual impairments could not hear the responses from the chatbot without TTS, and users with physical disabilities couldn't speak to the chatbot without STT. This limited their ability to fully interact with the chatbot, which was a significant accessibility issue.
Solution
Semantic HTML and using the platform
We resolved 50% of our accessibility issues by refactoring our code to adhere to the semantic HTML guidelines. Using the correct HTML element for each task is important to ensure all built-in accessibility features are readily available. This approach is much better than implementing workarounds using JavaScript.
We first examined the structure of our DOM nodes and noticed that the order in which they were parsed differed from how they were displayed. Most of our UI layouts were stitched together with divs, with their heights adjusted based on other elements and their positions set using JavaScript. This caused performance issues and made it impossible for visually impaired users to interact with the widget. For example, the chat banner was originally positioned below the chat conversation container but needed to be displayed at the top, and the quick-reply buttons were placed outside the chat container. To resolve this, we reorganized the DOM nodes using Flexbox and Grid so that they appeared in the same order as they were displayed. By doing so, we also no longer needed JavaScript to adjust the heights of these elements, thereby improving the widget's performance.
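The idea can be sketched with a simplified layout; the markup and class names below are illustrative, not our production code:

```html
<!-- DOM order now matches visual order, so screen readers
     announce the banner first, then the conversation, then
     the quick replies and input. -->
<div class="chat-widget">
  <header class="chat-banner">Support chat</header>
  <main class="chat-messages"><!-- conversation --></main>
  <div class="quick-replies"><!-- quick-reply buttons --></div>
  <footer class="chat-input"><!-- text input --></footer>
</div>

<style>
  .chat-widget {
    display: flex;
    flex-direction: column;
    height: 100%;
  }
  /* The message list absorbs the remaining height, so no
     JavaScript height calculations are needed. */
  .chat-messages {
    flex: 1;
    overflow-y: auto;
  }
</style>
```

Because the flex container handles the sizing, resizing the window or the banner no longer requires any script to run.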
We had a problem with some buttons in our widget that were created using `<div>` elements instead of the proper button element. This caused issues with the focus order and made it difficult for keyboard users to interact with the widget. Additionally, buttons with icons and no text were not labeled, making it challenging for visually impaired users to know what a button did. To fix this, we added ARIA labels and titles to these buttons so that screen readers would provide users with information about each button's function. Using attributes like role, aria-label, aria-selected, aria-expanded, etc., we made it easy for users to interact with the widget.
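A before-and-after sketch of one such icon button; the handler name `closeChat` and the label text are illustrative:

```html
<!-- Before: a div "button" — not focusable, not announced,
     and unreachable via the keyboard. -->
<div class="close-icon" onclick="closeChat()">✕</div>

<!-- After: a native button with an accessible name. It is
     focusable and keyboard-activatable by default, and a
     screen reader announces it as "Close chat, button". -->
<button type="button" aria-label="Close chat" onclick="closeChat()">✕</button>
```

The native element gives us focus handling, Enter/Space activation, and the button role for free, which is exactly the "use the platform" lesson above.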
In the above video, the user can now navigate through the different buttons and links using the keyboard, since we use the right HTML tags for the job.
Customizations
Many customers of yellow.ai used custom color schemes for their widget UI, which led to some users being unable to interact with the widget as expected due to color blindness. To address this issue, we limited color customization to the background elements and derived the text color from those backgrounds and their font sizes. We ensured that text and icons had the right contrast ratio: at least 4.5:1 for fonts smaller than 24px and at least 3:1 for fonts 24px and larger. This approach ensured all users could interact with the widget without accessibility issues.
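Deriving a text color from a background boils down to the WCAG contrast-ratio formula. A minimal sketch, assuming colors as `[r, g, b]` arrays; the function names are illustrative, not Yellow.ai's actual API:

```javascript
// Relative luminance of an sRGB color, per the WCAG 2.x definition.
function relativeLuminance([r, g, b]) {
  const channel = (c) => {
    const s = c / 255;
    return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * channel(r) + 0.7152 * channel(g) + 0.0722 * channel(b);
}

// Contrast ratio between two colors: (L1 + 0.05) / (L2 + 0.05),
// where L1 is the lighter of the two luminances.
function contrastRatio(fg, bg) {
  const l1 = relativeLuminance(fg);
  const l2 = relativeLuminance(bg);
  const [hi, lo] = l1 >= l2 ? [l1, l2] : [l2, l1];
  return (hi + 0.05) / (lo + 0.05);
}

// Pick black or white text, whichever contrasts more with the background.
function textColorFor(bg) {
  const black = [0, 0, 0];
  const white = [255, 255, 255];
  return contrastRatio(black, bg) >= contrastRatio(white, bg) ? black : white;
}

console.log(contrastRatio([255, 255, 255], [0, 0, 0])); // maximum ratio, ~21
```

Given a derived color, the 4.5:1 or 3:1 threshold is then checked against the font size before the theme is applied.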
For images and videos uploaded by customers for bot replies, we provided an option to enter alt texts and captions. Also, we used a default text as a fallback to help the user understand what the media content was about.
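The fallback itself is simple; a sketch, assuming a media attachment object with an optional customer-provided `altText` field (the object shape and default string are illustrative):

```javascript
// Use the customer-provided alt text when present; otherwise fall
// back to a default so screen-reader users are never left with
// an unlabeled image.
function altTextFor(media, fallback = "Image shared by the bot") {
  const alt = (media.altText || "").trim();
  return alt.length > 0 ? alt : fallback;
}

altTextFor({ altText: "Refund policy flowchart" }); // "Refund policy flowchart"
altTextFor({}); // "Image shared by the bot"
```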
TTS/STT support
To ensure that all users, regardless of their accessibility needs, can interact with our widget effectively, we introduced speech-to-text and text-to-speech features. These features are designed to assist users who do not use screen readers or other accessibility technologies but still face difficulties interacting with the site. With speech-to-text, users can speak their messages to the widget, which the widget will then translate into text and send to the bot or chat agent. Similarly, with text-to-speech, the widget will read out the messages on the widget to the user, facilitating interaction. These features enhance the user experience, making our widget more inclusive and accessible.
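The wiring can be sketched with the browser's Web Speech API. This is a simplified, browser-only sketch, not our production code: the speech objects are injected as parameters so the logic stays testable outside a browser, and names like `onMessage` are illustrative. Note that `SpeechRecognition` still ships with a `webkit` prefix in some browsers.

```javascript
// TTS: read a bot message aloud.
function speakMessage(text,
                      synth = globalThis.speechSynthesis,
                      Utterance = globalThis.SpeechSynthesisUtterance) {
  const utterance = new Utterance(text);
  utterance.rate = 1; // normal speaking rate
  synth.speak(utterance);
  return utterance;
}

// STT: turn the user's speech into a chat message and hand it
// to the provided callback.
function listenForMessage(onMessage,
                          Recognition = globalThis.SpeechRecognition ||
                                        globalThis.webkitSpeechRecognition) {
  const recognition = new Recognition();
  recognition.lang = "en-US";
  recognition.onresult = (event) => {
    onMessage(event.results[0][0].transcript);
  };
  recognition.start();
  return recognition;
}
```

In the widget, `speakMessage` would be invoked as each bot reply arrives, and `listenForMessage` would be bound to a microphone button, with the transcript sent to the bot as if it had been typed.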
Here’s TTS in action below:
Here’s STT and TTS in action together below:
Testing a11y with tools from our accessibility partners
Thanks to our accessibility audit partners, we could automate a11y tests to measure how our widget performed in terms of accessibility and ease of use. This helped us fix many of our a11y issues, test them, and iterate further on any remaining issues efficiently. Using axe DevTools and the report provided by a11y engineers at the accessibility firm, we improved most of our components, like quick replies, product cards, carousels, banners, etc.
Outcomes
After implementing these changes, we not only significantly enhanced the user experience of our chatbot, but also became a feasible choice for clients such as government agencies, educational institutions, and healthcare providers in North America and Europe, where accessibility is a crucial factor when selecting software. We are now well-positioned to meet their needs.
The team also learned a lot about building accessible UIs, and about why semantic HTML and using the platform as much as possible play such an important role in building web apps correctly.
What next?
Now that we’ve achieved WCAG compliance, we cannot afford to repeat the same mistakes as we grow further as a product. We use what we learned from this exercise to ensure every new component we build goes through a11y tests and provides the best experience possible to all users. Starting with designs, we ensure the contrast ratios follow the guidelines. During development, we ensure keyboard navigation works seamlessly, along with accessibility tools like VoiceOver and others. During QA, we test whether the UI is perceivable, operable, understandable, robust, and conformant.
We are committed to providing the best user experience to all our customers and will continue to add more a11y-friendly features in the future.