
Project Overview
Speako is...
A ChatGPT-powered mobile app designed to help intermediate language learners practise speaking using their own photos. Many learners feel stuck when chatting with AI, unsure how to start or keep the conversation going. Speako tackles this by letting users upload personal photos to spark more natural, engaging conversations.
By grounding chats in real-life moments, the app reduces the mental effort often involved in speaking a foreign language and helps learners connect new vocabulary with their own memories.
The final solution included a high-fidelity prototype paired with a customised GPT model, both tested with real users to validate their effectiveness in supporting effortless, meaningful spoken English practice.
My Role
Product Designer,
User Researcher,
UI/UX Designer...
Project Type
Self-Initiated
Timeline
Summer-Winter 2024
Team
Just Me
Design Process
A research-led, human-focused Design Thinking process customised for Speako
I followed the Stanford Design Thinking framework to guide the research, design, and testing of Speako. This human-centred approach helped me deeply understand users’ needs and continuously iterate based on real feedback. Its flexibility also made it ideal for working on an experimental project where the outcome wasn’t always predictable.
Things didn’t always go as planned. Some early concepts didn’t fully solve user problems, which pushed me to pivot, dig deeper into the research, and refine ideas through testing. I often jumped between stages to quickly adapt and improve. Through trial and error, continuous user input, and rapid iteration, I shaped an experience that solves user problems and brings them joy.

Outcomes
Developed a concept that made AI conversations 2× longer, 20% easier to start, and 32% more enjoyable for intermediate English learners.
Designed a high-fidelity mobile app that achieved an NPS of +32, a 32-point increase over using the GPT model alone for spoken English practice.
Planned and conducted 4 iterative rounds of user testing, ensuring each design decision was grounded in real feedback and meaningfully improved usability at every stage.
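For readers less familiar with the metric: NPS is the percentage of promoters (ratings of 9–10 on a 0–10 "would you recommend?" question) minus the percentage of detractors (ratings of 0–6). A minimal sketch of the calculation, using hypothetical ratings rather than the study's actual data:

```python
def nps(scores):
    """Net Promoter Score from 0-10 recommendation ratings:
    % promoters (9-10) minus % detractors (0-6); passives (7-8) ignored."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return round(100 * (promoters - detractors) / len(scores))

# Hypothetical ratings for illustration only, not the study responses
speako_ratings = [9, 10, 8, 9, 7, 10, 9, 6]
baseline_ratings = [7, 8, 6, 5, 8, 9, 6, 7]
improvement = nps(speako_ratings) - nps(baseline_ratings)
```

The "+32" headline is the same kind of delta: Speako's NPS minus the NPS of the plain GPT model baseline.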
Deep dive into my design process
The Problem (My Initial Motivation)
We all know conversations improve language proficiency, but why is it still so hard to practise with AI?
Speaking is often the hardest language skill to improve, especially without regular access to real conversation. Traditional classrooms tend to focus on grammar and writing, while 1-on-1 speaking practice with tutors can be expensive and anxiety-inducing. As a result, many language learners have started turning to AI tools like ChatGPT for more flexible, non-judgemental speaking practice.
However, through early conversations with a few friends and some social media research, I discovered a common pattern: they often stop using ChatGPT to practise speaking after just a week or two.
Something wasn’t clicking. The initial excitement faded quickly, leaving users feeling awkward, lost, or unsure how to keep going.
Marketing Gap in Language Learning Apps
Language apps are booming, but non-beginners are still underserved, especially when it comes to speaking practice.
1/4 of users are underserved (Civic Science, 2023)
While 23% of language app users aim for fluency, most products are designed for curiosity-driven beginners — offering shallow, gamified content or scripted exercises that can’t support deeper, personalised speaking practice.
2% retention by day 30 (Business of Apps, 2025)
Education apps have one of the lowest retention rates. Like with ChatGPT, users tend to drop off once the novelty fades.
My Research Aimed to Explore
Why do learners struggle to stick with AI-driven conversation practice, and how might we create an experience that keeps them engaged?
User Research
Combining interviews and observational studies helped me understand both what users said and what they actually did.
While interviews helped uncover learners’ thoughts, feelings, and past experiences with AI speaking practice, observation allowed me to see real-time behaviours and reactions that users might not consciously mention.
🗣️ 7 User Interviews
Semi-structured conversations
Explored motivations, frustrations, and emotional experiences in AI speaking practice
Focused on uncovering what learners think and feel about using AI
👀 4 Combined Observational Studies
Live casual practice sessions with ChatGPT or competitor apps
Observed real-time behaviours: hesitation points, engagement drops, emotional shifts
Focused on capturing unspoken challenges







7 Online User Interviews with 4 Observational Studies
Synthesising Findings
Language learners ask for instant feedback and corrections, but their real struggles go deeper.
After user interviews and observational studies, I synthesised the findings using an affinity diagram to group common themes, behaviours, and emotional patterns. This process helped me move beyond surface-level requests and uncover deeper, recurring problems across users from different backgrounds.

Affinity Diagram: Each sticky note represents a thought, a feeling, or a frustration shared by real users
Key User Insights
Why AI alone isn’t enough, and what users really need
From synthesising the research, I uncovered three key insights that shaped the design challenge. Solving these problems would not only meet user needs more meaningfully but also create an opportunity to stand out in the crowded AI language learning market.
Insight 1: AI can talk about anything users find interesting, but after a busy day, users often do not know “What’s in their mind” when starting a conversation
Although AI offers endless conversation possibilities, learners often feel tired when deciding what to talk about in their busy daily lives. Meanwhile, rigid options like generic curricula or scripted scenarios also feel irrelevant and disconnected, making conversations equally "boring" and "pointless".
How might we make it effortless for users to start a meaningful conversation, so they find it easier to practise every day?
Insight 2: High cognitive load of formulating what to say during a conversation puts users off
Learners experience mental fatigue when they must constantly think about how to frame their responses (e.g., in roleplay scenarios). Without meaningful, specific content to anchor the interaction, the effort of formulating thoughts becomes overwhelming, causing conversations to lose momentum and making practice feel tiring, empty, and easy to abandon.
How might we help users find what to say during a conversation in a meaningful way, so they can stay engaged for longer?
Insight 3: Without effective feedback support and meaningful reflection after a conversation, practice feels pointless and forgettable, leaving users unsure if they’re improving at all
Learners struggle to retain new vocabulary or remember their mistakes when there’s no space for reflection or easy-to-digest feedback. This is also something many language learning apps fail to provide. As a result, motivation fades, and speaking practice starts to feel like a dead end.
How might we support learners to reflect on their speaking practice and absorb feedback in a way that feels effortless, meaningful, and motivating?
Expert insights
”You need to win the hearts and minds of the user, so they know they’re using it to truly learn, not just as a tool of convenience.”
To complement user research, I conducted expert interviews with two experienced English language instructors from the Kingston Language Scheme (KLS) programme. Both experts had extensive teaching experience spanning 10 to 30 years across various learner levels.


Online Interviews with Two Language Tutors
The insights they shared helped me prioritise what truly benefits learners, guiding decisions in my later design process.
Intermediate learners want to use what they already know in richer ways in real life, not just learn more grammar or vocabulary.
Learning happens more through reformulation and recognition than direct correction.
Prioritise reinforcing correct patterns naturally rather than interrupting conversations with constant corrections.
Flexibility is essential. Learners need to practise when they feel most ready.
Make speaking practice easily accessible anytime, anywhere, supporting spontaneous short practice bursts.
Personas
Meet intermediate English learners
I focused on intermediate English learners — people who already know the basics and grammar, but struggle to find the right way to keep speaking.
They’re not looking for lessons or drills. They want to talk, to practise thinking in English, to broaden their vocabulary naturally, and to express themselves more confidently and effortlessly.
Lower-Intermediate English
🧍♀️ Anastasiya | 22 | Ukrainian | Student in Glasgow
😟 Challenge: Often gets stuck mid-sentence or feels nervous and overwhelmed in English conversations
🎯 Goal: Gain confidence and fluency

Upper-Intermediate English
🧍♀️ Joyce | 27 | Taiwanese | Marketing Manager Assistant in London
😟 Challenge: Feels frustrated when she can’t accurately express herself.
🎯 Goal: Move beyond functional fluency to speak with more ease, nuance, and authenticity

User Journey Maps of Current Experiences
Current UJMs: Emotional Drop-Off in Speaking Practice
I mapped the journeys of two target users as they used ChatGPT for spoken English. I chose ChatGPT because it's widely accessible, voice-enabled, and increasingly used by learners seeking casual, self-directed practice.
My goal was to uncover where it falls short for users. These insights helped identify key opportunities to design a more focused tool which goes beyond ChatGPT and current language learning products.


💡 Why Design Intervention Was Needed
Both user journeys revealed a clear and rapid emotional decline — starting with early curiosity and hope, but quickly shifting to disconnection and disengagement.
Two major user and business problems were prioritised through the UJMs:
❗Starting and sustaining conversations feels mentally exhausting
💼 Business impact: Low activation and early drop-off. Users fail to build a consistent habit and abandon the app after initial use.
❗️Lack of feedback and reflection makes practice feel unproductive and pointless
💼 Business impact: Low perceived value. Users don’t feel they’re improving and are unlikely to return or recommend the app.
Define the Key Design Challenges
❓ How Might We...
help users start and sustain a conversation with less effort, so that practice becomes easier to return to, even after a long day when energy is low?
support users in absorbing new vocabulary and feedback in a way that feels effortless and meaningful, so that each session feels useful and leads to a real sense of progress?
Ideate
Using HMW prompts, rapid sketching, and AI collaboration throughout the design process
As a solo designer, I relied on a combination of Crazy 8 sketching and AI as a creative partner, guided by my “How Might We” questions, to drive ideation at different stages.
I didn’t treat ideation as a one-time activity. I revisited it continuously throughout the project to quickly generate and explore ideas, from early concepts to more detailed interaction design later on.

Part of My Crazy 8 Ideation Sketches
Trials & Errors
❌ Spending too much time developing a concept that didn’t solve the core problem
Early on, I explored a concept built around “topic cards” — customisable categories users could select to guide their conversations.
I built a low-fidelity prototype and quickly tested it with just two users without gathering any quantitative data. Initial verbal feedback was positive: users liked having ownership over their topics and praised the in-conversation interactions. Based on their feedback, I developed a more detailed wireframe to test usability and designed additional features around it.
However, further testing with five users revealed a critical issue: while the feedback and correction features were well-received, the core concept lacked desirability.

Low-Fi Prototype Made in Miro

Mid-fidelity wireframe prototype made in Figma

Learn from Failure
😔 User testing data revealed a lack of emotional engagement — there was no spark to draw users in.
Despite positive feedback on the AI corrections and structured feedback features, post-test surveys showed a clear mismatch: users rated the experience as user-friendly, but the desirability score remained low. The experience made sense, but it didn’t excite users or make them want to return.
After analysing the qualitative feedback, I discovered a key gap: the emotional engagement was weakest at the very start of the experience. The “topic cards” didn’t generate enough curiosity or relevance to pull users into a meaningful conversation.
It became clear that choosing a topic alone wasn’t enough; users needed a stronger emotional hook to feel motivated and personally connected.



Trials and Errors
Key Findings from Deeper Research
Speaking about learner-generated photos that are personally meaningful and mentally familiar increases the number of words used in a conversation while reducing mental effort.
A 2022 study (Huynh, Lin, & Hwang) showed that learner-generated photos — compared to textbook prompts — significantly improved:
✅ Fluency (more words per minute)
✅ Vocabulary diversity
✅ Reduced intrinsic cognitive load
✅ Motivation and engagement
✅ Sense of autonomy

I created a literature mind map to connect my findings
A New Formula for the Solution
Learner-generated photos + ChatGPT = Less mental effort & more real engagement
Concept Validation with GPT Model
Speako made conversations 2× longer, 20% easier to start, and 32% more enjoyable
To evaluate whether learner-generated photos reduce the effort of chatting in English and increase engagement, I designed and ran a concept test with 8 intermediate English learners, split into two groups:
Group A (4 users): Used standard ChatGPT to have a casual verbal chat.
Group B (4 users): Used Speako’s customised GPT, where conversations were prompted by a personal photo and the session was customised for English learning purposes.
Participants chatted freely until they felt finished. I gathered objective timing data from recorded sessions and perceptual feedback from a post-conversation survey.
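The headline comparisons reduce to simple group statistics: the ratio of mean session lengths between groups, and the percentage change in mean survey ratings. A sketch of that analysis — the variable names and values below are illustrative stand-ins, not the recorded study data:

```python
from statistics import mean

def pct_change(baseline, variant):
    """Percentage change of the variant group's mean over the baseline's."""
    return round(100 * (mean(variant) - mean(baseline)) / mean(baseline))

# Illustrative per-participant values only, not the recorded data
minutes_a = [4.0, 5.5, 3.5, 5.0]    # session length, standard ChatGPT (Group A)
minutes_b = [9.0, 8.5, 10.0, 8.5]   # session length, Speako's custom GPT (Group B)
ease_a = [3.0, 2.5, 3.5, 3.0]       # 1-5 "easy to start" survey rating, Group A
ease_b = [3.5, 3.5, 4.0, 3.4]       # same rating, Group B

length_ratio = mean(minutes_b) / mean(minutes_a)   # "2x longer" style claim
ease_gain = pct_change(ease_a, ease_b)             # "20% easier" style claim
```

With only 4 participants per group, these figures are directional signals rather than statistically significant results, which is why they fed into design decisions rather than standing alone.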

A GPT model was customised to validate the idea of Speako
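A custom GPT of this kind is configured with plain-language instructions rather than code. The exact prompt isn't reproduced here; an illustrative sketch of instructions consistent with the behaviours described in this study (photo-grounded openers, reformulation over interruption, an end-of-session recap) might look like:

```text
Role: You are Speako, a friendly English conversation partner
for intermediate learners.

When the learner sends a photo:
- Ask one open, personal question about the photo to start the chat.
- Keep replies short and conversational; ask one question at a time.
- Match the learner's level; introduce at most 1-2 new words per turn.
- Reformulate mistakes naturally instead of interrupting to correct.
- When the learner ends the chat, give a brief, encouraging recap:
  what went well, 2-3 useful phrases, and one thing to try next time.
```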

Speako’s concept testing results
💡 What This Means for Speako's App Design
Photo prompts need stronger onboarding support: Although Speako users found it easier to start conversations overall, the slightly longer start time suggests uncertainty when selecting or framing a photo. The app could be designed to reduce this friction.
Mid-conversation support is needed: The slight rise in difficulty maintaining conversation suggests opportunities to inspire or motivate users when they feel stuck.
UJM of Expected Experiences
Mapping the ideal experience to define what matters most
After further ideation around Speako’s key features, I mapped out new expected user journey maps for both user groups.
These updated journeys allowed me to:
✅ Visualise how my proposed features would support users at each stage of their experience
✅ Prioritise the features that delivered the greatest emotional and functional impact
✅ Ensure that the user flow aligned with their real-world routines, motivations, and pain points

Iterating the Information Architecture
Structuring the app to reduce friction and support consistent practice
As the concept for Speako evolved, so did its structure. I created and iterated on multiple versions of the information architecture to reflect changing user flows, to define feature scope, and to ensure both cohesion and manageable complexity.
Creating IA helped me answer one key question:
How can I organise the app in a way that reduces friction, supports consistent practice, and feels intuitive to both beginner and advanced users?



Information Architecture
Final Solutions
Speako: From “What’s in your mind?” to “What’s in your photo?”
Speako is a new kind of practice partner that helps users speak more fluently by turning everyday moments into meaningful, low-pressure speaking practice. By using learner-generated photos as prompts, Speako lowers the barrier to starting a conversation and makes language learning feel natural, personal, and effective.
Final Solutions

🔆 Talk Today and Story Time: Turn Your Photos Into Conversations
Talk Today: A lunch, a sunset walk, a book you’ve just bought... just snap and send. Everyday moments become effortless conversation practice.
Story Time: Pull out an old photo, like that trip to Barcelona or a random moment you love, and turn it into a story worth telling. Great for practising complex tenses, storytelling, and descriptive language.
“I feel like I don’t have to make things up anymore... Usually with AI, my mind just goes blank.” (From User)
✅ What This Solves:
Effortless Prompts: No need to think up conversation topics. A user’s camera roll is already a natural topic pool.
Emotional Connection: Talking about something personal makes the experience more engaging, and easier to remember.
Guided Freedom: Free enough to talk about anything, supported enough to feel guided.
💬 In-Conversation Interactions: Support Without Disruption
“I LOVE the word inspiration. I want it now.” (From User)
Add Emoji with Photo: Set the Mood

Word Inspiration: A Gentle Nudge

Real-Time Correction: Learn As You Speak


📝 Post-convo Feedback: Understand How You're Doing
“Is it writing a diary for me? I wanna see it.” (From User)
✅ What This Solves:
Bite-sized Feedback: Clear highlights show what went well and what to improve. Easy to digest, even for busy learners.
CTA to Journal: The recap links straight to a personalised journal entry, motivating users to speak more throughout the day to “unlock” their story.
Bridges the Gap between AI and Teaching: Feedback feels human, supportive, and motivating, just like a good teacher would.

🔤 Vocabulary in Context: Not Just Another Flashcard
“Flashcards never worked for me... but I can see myself easily remember the word ‘flaky’ by seeing that cinnamon bun now.” (From User)
✅ What This Solves:
Contextual Recall: Users can revisit words in the exact moment they used them. You don’t just remember the word, you remember why you said it.
Deeper Retention: By linking words to emotions, places, and personal stories, vocabulary becomes easier to recall and reuse.
Breaks the Plateau: Helps learners push past the common “vocab plateau”.
🧩 Something Else...
Recap x Google Timeline: See where your English took you today, encouraging spontaneous speaking throughout the day.

Free Talk Mode: A space to ask the AI anything about language learning, or simply speak your mind

Widgets (Dark & Light Mode): A daily nudge to speak, right on your home screen

Testing Results
Speako vs ChatGPT

Other Metrics

Branding

Reflection
If I had more time, I would...
Design for edge cases
With AI products, users won’t always behave the way you expect. They might upload unusual photos, ask unexpected questions, or use the tool in creative ways. If I had more time, I would have explored how the app responds to edge-case behaviours and added safeguards or fallback experiences to handle unpredictable interactions more gracefully.
Address privacy
Some users raised privacy concerns about uploading photos. Given more time, I’d design clearer privacy settings, data-use explanations, and opt-in moments to reassure users that their content is handled with care and transparency.
Get more users for concept validation and testing
If time allowed, I would have reached out to a broader and more diverse group of learners to validate the concept more rigorously and capture a wider range of perspectives.
One thing I’d do differently...
Test earlier and more often
One of my biggest takeaways: don't wait. I could’ve identified major issues in the original concept earlier with quick, scrappy tests. It would’ve saved time and led to stronger design decisions from the start.