How Duolingo Became an AI Powerhouse: A Product Case Study


Recently I installed Duolingo to brush up on the Spanish I learned as part of my B-school curriculum. Duolingo, for those of you who are unaware, is an app where you can learn languages through super-fun gamification. It works on a freemium model, with the free tier carrying ads and online-only access.


Totally impressed by the gamified learning experience, I dug deeper into what drives such high levels of accuracy and such an excellent, personalized customer experience. Unsurprisingly, it is Artificial Intelligence, learning from the data generated by a growing user base of millions who practice on the app every day.


But at what levels?


What I found was astounding! Duolingo is probably ahead of many large enterprises that boast of being AI-driven organizations. More than the Machine Learning algorithms, which are excellent (to say the least), what truly impressed me is their understanding of the core business use cases. Understanding the gaps in the user experience, focusing on capturing the right, valuable data, and then building a superior algorithm around it should be the strategy of any organization adopting Artificial Intelligence. Duolingo is now one of my favorite case studies of AI implementation.


Alright, so it all starts with building a User Profile on the backend when you sign up, and updating this profile for every new word that you come across. Duolingo builds a super-detailed profile of what you know and what you don’t know, down to the level of EVERY SINGLE WORD.
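
To make the idea of a word-level profile concrete, here is a minimal sketch in Python. The class names, fields, and methods are hypothetical and chosen only for illustration; Duolingo's actual schema is not public in this article.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class WordStats:
    """Per-word counters for one learner (hypothetical schema)."""
    seen: int = 0
    correct: int = 0


@dataclass
class LearnerProfile:
    """Hypothetical learner profile that tracks every word separately."""
    words: dict = field(default_factory=lambda: defaultdict(WordStats))

    def record(self, word: str, was_correct: bool) -> None:
        # Update the counters each time the learner meets a word.
        stats = self.words[word]
        stats.seen += 1
        if was_correct:
            stats.correct += 1

    def accuracy(self, word: str) -> float:
        # Share of correct answers for this word; 0.0 if never seen.
        stats = self.words[word]
        return stats.correct / stats.seen if stats.seen else 0.0


profile = LearnerProfile()
profile.record("gato", True)
profile.record("gato", False)
```

Every later model in the pipeline could read from a structure like this, which is why capturing the data at word level matters so much.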


Next comes the famous Computer Adaptive Placement Test. After you sign up for a new language, Duolingo tests your level of knowledge for about five minutes. For example, someone who already has a basic understanding of Spanish, like myself, will start further along than someone who has never studied it before.
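
As a rough illustration of how an adaptive test can home in on a starting level with few questions, here is a binary-search sketch in Python. This is a deliberate simplification and not Duolingo's method: real computer adaptive tests usually rely on statistical models of item difficulty, and the function name, the `can_answer` callback, and the ten-level scale are all assumptions.

```python
from typing import Callable


def estimate_level(can_answer: Callable[[int], bool], num_levels: int = 10) -> int:
    """Return the highest level (0..num_levels) the learner can handle.

    `can_answer(level)` is a hypothetical callback that asks one question
    at the given level and reports whether the learner got it right.
    Level 0 means "complete beginner".
    """
    low, high = 0, num_levels
    while low < high:
        mid = (low + high + 1) // 2  # probe the upper middle of the range
        if can_answer(mid):
            low = mid                # learner handles this level, look higher
        else:
            high = mid - 1           # too hard, look lower
    return low


# A simulated learner who copes with everything up to level 6.
level = estimate_level(lambda lvl: lvl <= 6)
```

With ten levels this needs at most four questions, which is the appeal of adapting the next question to the previous answer instead of asking everything.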


Once you are past the placement test, you arrive at the main powerhouse of Duolingo: the Spaced Repetition System. It was their very first AI project. It counts the number of times you have seen a word and then very accurately predicts how long you are likely to retain that particular word in your memory. The model can also predict whether you have forgotten something because you haven’t seen it very frequently or very recently. Within each lesson, Duolingo decides which exercises to give you based on the words and concepts the app believes you need to practice. The specific exercises you are offered vary, so each lesson ends up being different for everyone. It spaces personalized practice over longer intervals for optimal learning, rather than cramming lessons into a short period of time.
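
The retention idea above can be sketched with a simple forgetting curve. The Python below assumes that the chance of recalling a word halves every "half-life" and that each correct answer doubles that half-life while each wrong answer halves it. Those update rules, the function names, and the 0.5 threshold are my assumptions for illustration, not Duolingo's production model.

```python
def half_life_days(times_correct: int, times_wrong: int, base_days: float = 1.0) -> float:
    # Assumption: each correct answer doubles the half-life, each miss halves it.
    return base_days * 2.0 ** (times_correct - times_wrong)


def recall_probability(days_since_seen: float, half_life: float) -> float:
    # Exponential forgetting curve: probability halves every `half_life` days.
    return 2.0 ** (-days_since_seen / half_life)


def words_to_review(history: dict, threshold: float = 0.5) -> list:
    """Return words whose predicted recall is below `threshold`, weakest first.

    `history` maps word -> (days_since_seen, times_correct, times_wrong).
    """
    scored = []
    for word, (days, right, wrong) in history.items():
        p = recall_probability(days, half_life_days(right, wrong))
        if p < threshold:
            scored.append((p, word))
    return [word for _, word in sorted(scored)]


history = {
    "gato": (1.0, 3, 0),   # seen yesterday, well practiced
    "perro": (4.0, 1, 0),  # seen four days ago, practiced once
}
due = words_to_review(history)
```

In this toy example only "perro" falls below the threshold, so it is the word a lesson would bring back first.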


Then comes their machine learning implementation: Birdbrain. It’s funny that before you even see a sentence, Duolingo already knows the probability of you getting it wrong, without you telling it anything about what you know and what you don’t, thanks to Birdbrain. So, as you get more questions right or wrong, it adjusts the difficulty based on your performance. The Ultimate Personalized Learning System. To improve its performance, every night it trains on ~500 million lessons from the day before.
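
To show how a model can predict your odds before you answer and then adjust afterwards, here is a small Elo-style sketch in Python. It assumes one ability score per learner and one difficulty score per exercise, combined through a logistic function. The names, the learning rate, and the update rule are assumptions for illustration and should not be read as Birdbrain's actual internals.

```python
import math


def p_correct(ability: float, difficulty: float) -> float:
    # Logistic curve: 0.5 when ability equals difficulty, higher when abler.
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def update(ability: float, difficulty: float, was_correct: bool, lr: float = 0.1):
    """Nudge both scores toward the observed outcome.

    A surprising result (large prediction error) moves the scores more
    than an expected one.
    """
    error = (1.0 if was_correct else 0.0) - p_correct(ability, difficulty)
    return ability + lr * error, difficulty - lr * error


ability, difficulty = 0.0, 0.0
before = p_correct(ability, difficulty)  # an even bet before any data
ability, difficulty = update(ability, difficulty, was_correct=True)
after = p_correct(ability, difficulty)   # slightly better odds next time
```

Running an update like this over a full day of answers is one plausible reading of the nightly retraining described above.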


To understand why you got a particular problem wrong, Duolingo uses the BLAME algorithm to assign blame: for example, whether you didn’t know the word at all or only its past participle. It tags every exercise.
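
Here is a toy Python sketch of that distinction between not knowing a word and not knowing its form. The tiny lemma table and the blame labels are hypothetical stand-ins; the article does not describe how BLAME works internally.

```python
# Hypothetical lookup from an inflected form to its dictionary form (lemma).
LEMMAS = {
    "comer": "comer",
    "comido": "comer",   # past participle of "comer"
    "comí": "comer",
    "hablar": "hablar",
    "hablado": "hablar",
}


def assign_blame(expected: str, given: str) -> str:
    """Classify a single-word mistake with an illustrative label."""
    if given == expected:
        return "none"
    expected_lemma = LEMMAS.get(expected)
    if expected_lemma is not None and LEMMAS.get(given) == expected_lemma:
        # Right word, wrong form: blame the grammar, not the vocabulary.
        return "word_form"
    return "vocabulary"


blame = assign_blame("comido", "comí")
```

Tagging each exercise with the words and forms it tests is what lets a classifier like this map a mistake back to a specific gap.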


The company is also working on an Active Learning feature called Smart Tips. It tries to figure out the root cause of a mistake so that it can offer you a tip in real time, for example when you have the right words in an incorrect order. But it isn’t as simple as it sounds. Each response in each challenge goes through an NLP pipeline. Figuring out the specific mistake down to the level of gender and adjective agreement is tough, to say the least. It runs NLP on both the correct answer and the wrong answer, notices the difference, tries to explain it, and then aggregates the results over millions of exercises every day before suggesting rules to users, who then collaborate with the AI to decide the right set of rules.
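
The "right words in an incorrect order" case is the easiest one to sketch. The Python below compares the correct answer with the learner's answer as bags of words and then counts the diagnoses across many responses. Whitespace tokenization and the three labels are simplifying assumptions; a real NLP pipeline would do far more.

```python
from collections import Counter


def diagnose(correct: str, answer: str) -> str:
    """Label a response by comparing it with the correct answer."""
    correct_tokens = correct.lower().split()
    answer_tokens = answer.lower().split()
    if answer_tokens == correct_tokens:
        return "correct"
    if Counter(answer_tokens) == Counter(correct_tokens):
        # Same words, different sequence.
        return "word_order"
    return "other"


def most_common_mistake(responses: list) -> str:
    """Aggregate diagnoses over many (correct, answer) pairs."""
    counts = Counter(diagnose(c, a) for c, a in responses)
    counts.pop("correct", None)  # only mistakes are interesting here
    return counts.most_common(1)[0][0] if counts else "none"


responses = [
    ("el gato negro", "el negro gato"),
    ("el gato negro", "el gato negro"),
    ("la casa roja", "la roja casa"),
    ("la casa roja", "el casa roja"),
]
top = most_common_mistake(responses)
```

Aggregating like this is what would turn individual slips into a candidate rule worth showing as a tip.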


And then there is their Regression Model, which analyses the error patterns of millions of users on the app to decide the content delivery model that is relevant for each user and their personalized learning needs. Not just that: when you appeal that you got an answer right while the app marked it wrong, a machine learning model on the backend predicts whether your appeal is likely to be accepted by the contributors. Given the huge amount of data that gets generated, they have now developed an interface that rank-orders all of the reports so that they can find the most salient ones to fix first.
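
A rank-ordered report queue could look something like the Python sketch below. It assumes each report already carries a predicted acceptance probability from some model and scores it by that probability times the number of learners who filed it. The `Report` fields and the scoring formula are hypothetical, not something the article specifies.

```python
from dataclasses import dataclass


@dataclass
class Report:
    exercise_id: str
    submitted_answer: str
    p_accept: float  # hypothetical model output: chance the appeal is valid
    count: int       # how many learners reported the same answer


def rank_reports(reports: list) -> list:
    # Assumed salience: likely-valid appeals that affect many learners first.
    return sorted(reports, key=lambda r: r.p_accept * r.count, reverse=True)


queue = rank_reports([
    Report("ex1", "el coche", p_accept=0.9, count=10),
    Report("ex2", "la carro", p_accept=0.2, count=100),
    Report("ex3", "el auto", p_accept=0.5, count=4),
])
```

The point of the ordering is that reviewers with limited time see the fixes with the largest expected payoff at the top.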


To make language learning more realistic, Duolingo also uses AI-powered chatbots that teach language through automated text-based conversations. Not only do they help you learn how to converse in a new language, but they also improve and become smarter each day as these conversations generate more and more data.


That covers the core Product.


But it doesn’t end there. Duolingo also uses its supreme AI capabilities for engagement. The app sends you one notification each day, reminding you to practice. They have named their notification algorithm Bandit; it learns not just when to send you the notification but also what to send you. It learns from the data that you provide. Based on your response to the notification on the first day, and the actions that you take, it learns and gets better.
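
The name points to multi-armed bandit algorithms, so here is a minimal epsilon-greedy bandit in Python that picks which message to send. Epsilon-greedy is simply the easiest bandit to show and is my substitution, not necessarily what Duolingo runs. The message names and the reward of 1 for "the learner opened the app" are assumptions too.

```python
import random


class EpsilonGreedyBandit:
    """Pick among notification messages, learning from observed rewards."""

    def __init__(self, arms, epsilon: float = 0.1, seed=None):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.pulls = {arm: 0 for arm in self.arms}
        self.rewards = {arm: 0.0 for arm in self.arms}

    def mean_reward(self, arm) -> float:
        # Average reward so far; 0.0 for an arm that was never tried.
        pulls = self.pulls[arm]
        return self.rewards[arm] / pulls if pulls else 0.0

    def choose(self):
        # Explore a random message with probability epsilon, else exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=self.mean_reward)

    def update(self, arm, reward: float) -> None:
        self.pulls[arm] += 1
        self.rewards[arm] += reward


bandit = EpsilonGreedyBandit(["streak_reminder", "new_words", "friend_update"], seed=0)
message = bandit.choose()
bandit.update(message, reward=1.0)  # 1.0 if the learner opened the app
```

The same loop could cover send time by treating each time slot as an arm, which is how one algorithm can learn both the "when" and the "what".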


Now, even to someone who is fascinated by technology, this whole ecosystem would appear perfect, a solved problem. But Duolingo doesn’t want to stop here. They are considering venturing into Virtual Reality for more immersive experiences. As Burr Settles, the AI and Research Head at Duolingo, says, “The core part of AI strategy is to get as close as possible to having a human-to-human learning experience.”


During the pandemic, Ed-tech has witnessed a huge surge in usage, from educational institutions looking for more effective ways of teaching to businesses looking for optimal training programs for employees. This trend is likely to continue even post-pandemic. With such favorable conditions, a sophisticated technical infrastructure, and a huge amount of user-specific data, Duolingo’s growth story will definitely be fascinating over the next 5-10 years.


Recently, in November, Duolingo raised $35 Million in Series H funding from General Atlantic and Durable Capital Partners (Source: Crunchbase) at a valuation of $2.4 Billion. With such technical prowess, a large user base and its data, and a growing Ed-tech industry, an acquisition at around 4 Billion USD (or perhaps more) in the next 2-3 years cannot be ruled out.
