What is Friction?
Friction is an AI-powered usability testing tool. It accepts a Figma design link or a live website URL, generates a set of realistic user personas, and simulates each persona attempting a task on that design - producing a full report with heatmaps, usability scores, and prioritised findings.
The simulation methodology is grounded in Jakob Nielsen's think-aloud protocol and the 12-step usability testing framework - the same approach used in professional UX research labs. The difference is speed and cost: a Friction session takes seconds and runs entirely in the browser without recruiting participants.
Under the hood, Friction uses two Anthropic models. The full session analysis runs on Claude Opus 4.6 - Anthropic's most capable model - for its reasoning depth and ability to maintain persona consistency across a structured multi-output response. Persona generation and task suggestion use Claude Haiku 4.5, a faster, lighter model suited to those simpler structured tasks.
Your link
Friction accepts two input types, detected automatically from the URL you paste.
Figma design link
A direct link to a frame or screen in a Figma file. The URL must include a node-id parameter pointing to the specific frame. Friction reads the design and simulates interaction with it without requiring Figma API access.
Live website URL
Any publicly accessible https URL. Friction analyses the live page as a browser would render it. The URL must use https or http - file:// and localhost URLs are not supported in production.
URL detection
Detection checks two things: (1) the protocol must be https: or http:, and (2) if the hostname includes figma.com it is classified as a Figma link; otherwise it is classified as a website. The urlType value ("figma" | "website") is passed through to every API call so the AI can adapt its simulation context accordingly.
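The two-part check above can be sketched in TypeScript. This is a minimal illustration of the logic as described, not Friction's actual source; the function name is an assumption.

```typescript
// Sketch of the two-part URL detection check described above.
// detectUrlType is an illustrative name, not Friction's actual implementation.
type UrlType = "figma" | "website";

function detectUrlType(raw: string): UrlType | null {
  let parsed: URL;
  try {
    parsed = new URL(raw);
  } catch {
    return null; // not a parseable URL at all
  }
  // (1) the protocol must be https: or http:
  if (parsed.protocol !== "https:" && parsed.protocol !== "http:") {
    return null;
  }
  // (2) figma.com hostnames are Figma links; everything else is a website
  return parsed.hostname.includes("figma.com") ? "figma" : "website";
}
```

The `null` return covers both invalid URLs and unsupported protocols such as `file://`.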
Task suggestion
When a valid URL is entered, the client fires a debounced POST to /api/suggest-task after 800ms of inactivity. This calls Haiku with the URL and type, asking it to produce one goal-based task in 12–20 words. The suggestion appears inline and can be accepted or ignored - it does not run automatically.
If you leave the task field blank, the session prompt instructs Opus to infer the most realistic primary task from the URL and persona goals. In most cases, providing a task produces more focused and accurate results.
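The 800ms debounce described above can be sketched with a generic helper. The `requestSuggestion` wrapper and its body shape are assumptions based on this document, not Friction's actual source.

```typescript
// Minimal debounce helper of the kind used for the 800ms task-suggestion timer.
// requestSuggestion and its request body are illustrative sketches.
function debounce<T extends unknown[]>(fn: (...args: T) => void, ms: number) {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: T) => {
    clearTimeout(timer);
    timer = setTimeout(() => fn(...args), ms);
  };
}

// Fires only after the user has stopped typing for 800ms.
const requestSuggestion = debounce((url: string, urlType: string) => {
  fetch("/api/suggest-task", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ url, urlType }),
  });
}, 800);
```

Because the timer resets on every keystroke, only the final URL the user settles on triggers a request.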
User personas
Friction generates exactly 3 personas per session via a POST to /api/suggest-personas. The prompt asks Haiku to produce three meaningfully distinct personas who would realistically attempt this task on this product.
Personas are varied across three axes that directly influence how each person experiences a design: technical literacy (Low / Medium / High), product familiarity (First-time / Returning), and emotional state (their mindset entering the session). These three axes were chosen because they are the dimensions most likely to produce divergent scoring - a low-literacy first-time user and a high-literacy returning user will have genuinely different experiences even on a well-designed page.
Persona schema
{
  "id": "persona-1",
  "name": "Sarah Okafor",
  "age": 58,
  "occupation": "Semi-retired teacher",
  "techLiteracy": "Low",
  "productFamiliarity": "First-time",
  "emotionalState": "Cautious - wants to book correctly, anxious about making a mistake",
  "goal": "Find a double room with breakfast included for a weekend stay"
}

Personas are visible on the session page before the analysis runs. You can review the generated set and re-run if the personas don't match your target audience. The personas are passed as a JSON array directly into the main session prompt.
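The same shape can be written as a TypeScript interface. This is inferred from the JSON example above; the interface name and union types are assumptions.

```typescript
// Persona shape inferred from the JSON example above. Field names match the
// sample; the interface name itself is illustrative.
interface Persona {
  id: string;
  name: string;
  age: number;
  occupation: string;
  techLiteracy: "Low" | "Medium" | "High";
  productFamiliarity: "First-time" | "Returning";
  emotionalState: string; // free-text mindset entering the session
  goal: string;           // what this persona is trying to accomplish
}
```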
The simulation
The full session runs as a single POST to /api/run-session using Claude Opus 4.6 with a 4,000 token output limit. The entire report - simulations, heatmaps, scores, and findings - is produced in one structured JSON response. Opus was chosen over Sonnet for its stronger instruction-following on complex nested schemas and its more nuanced persona voice differentiation.
The system prompt
The system prompt embeds several layers of constraint that shape the output quality:
The model is instructed to inhabit each persona fully - their age, literacy, emotional state, and goal - and narrate their interaction using first-person internal monologue. Internal monologue must sound like a real person, not a UX audit.
Personas think in goals, not features. The prompt explicitly prohibits action descriptions like "click the button labelled Reserve" - instead it must be "I need to pay, where do I go from here?" This keeps the simulation grounded in user intent.
The prompt distinguishes three hesitation types. Decision hesitation (weighing options) does not penalise clarity or flow. Comprehension hesitation (unclear label or concept) penalises clarity. Orientation hesitation (can't find where to go) penalises flow.
Scores are persona-relative - each persona is scored on what they actually experience. A well-designed page should score high for all personas. Scores only diverge when a specific persona genuinely encounters friction others would not, such as jargon that confuses a novice but not an expert.
The model is forbidden from suggesting further user testing, recruiting participants, running A/B experiments, or any other form of meta-recommendation. Every output must be a concrete design or functionality change.
Step structure
Each persona produces exactly 3 steps - a constraint imposed by the 4,000 token budget across all personas, heatmaps, scores, and findings in a single response. Each step has strict word limits enforced in the prompt:
action · max 12 words · What the persona physically does - scrolls, clicks, reads, abandons.
internalMonologue · max 25 words · First-person think-aloud voice, specific to this UI at this moment.
attention · max 10 words · What they are focused on - not what they should focus on.
outcome · enum · "success" | "hesitation" | "failure" - no middle ground.
outcomeDetail · max 12 words · Exactly what happened at this step, tied to the outcome enum.
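The step fields map onto a straightforward TypeScript shape. The interface name is an assumption; the word limits live in the prompt, not the type system.

```typescript
// Step shape implied by the field list above. Word limits are enforced in the
// prompt, not by the type system; the interface name is illustrative.
interface SimulationStep {
  action: string;            // max 12 words
  internalMonologue: string; // max 25 words
  attention: string;         // max 10 words
  outcome: "success" | "hesitation" | "failure";
  outcomeDetail: string;     // max 12 words
}
```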
Heatmaps
Heatmaps are generated per persona and represent predicted attention distribution across the page. Each heatmap produces exactly 4 zones - positioned as percentage coordinates (x, y, width, height as 0–100 values relative to the viewport) - with an attention level and an eye-tracking rationale.
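A zone as described above can be modelled and bounds-checked in a few lines. The interface and validator are illustrative sketches based on this description, not Friction's source.

```typescript
// Heatmap zone shape implied by the description above: percentage coordinates
// plus an attention level and rationale. Names are illustrative.
interface HeatmapZone {
  x: number;      // 0-100, % of viewport width
  y: number;      // 0-100, % of viewport height
  width: number;  // 0-100
  height: number; // 0-100
  attention: "Hot" | "Warm" | "Cool" | "Dead";
  rationale: string; // eye-tracking justification for this zone
}

// A zone is valid only if it stays within the 0-100 viewport bounds.
function isValidZone(z: HeatmapZone): boolean {
  const inRange = (v: number) => v >= 0 && v <= 100;
  return (
    inRange(z.x) && inRange(z.y) &&
    inRange(z.width) && inRange(z.height) &&
    z.x + z.width <= 100 && z.y + z.height <= 100
  );
}
```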
The model is instructed to apply established eye-tracking scan patterns based on the page type. These patterns come from decades of research by the Nielsen Norman Group, the Poynter Institute, and academic eye-tracking studies:
F-pattern: Users scan a full horizontal strip across the top, then a second shorter strip further down, then a vertical strip down the left edge. Content in the right gutter of a text-heavy page is frequently ignored.
Z-pattern: The eye travels across the top (logo → CTA), diagonally down-left to the next anchor, then across the bottom. Works well for pages with a single primary action and minimal competing content.
Gutenberg diagram: Attention concentrates in the top-left primary optical area and the bottom-right terminal area. The top-right fallow area and bottom-left strong fallow area receive minimal attention.
Attention levels
Hot
Hero headlines, primary CTAs, prices, lead images. First-fixation zones.
Warm
Secondary content scanned en route to the goal - subheadings, supporting text.
Cool
Elements that exist in the layout but rarely receive deliberate attention.
Dead
Ignored entirely - footers, sidebars, elements below the fold for impatient users.
Each heatmap also includes a missedCritical array - elements that should have received high attention given the task (e.g. a "Breakfast included" column header when the task is to find breakfast options) but were predicted to be missed entirely. These often surface the most actionable design findings.
Usability scores
Each persona is scored 1–10 across five dimensions. The persona's overall score is the mean of its five dimension scores, and averageScore at the report level is the mean of all persona overall scores.
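The averaging above is plain arithmetic; as a sketch (function names are illustrative):

```typescript
// Scoring arithmetic as described: a persona's overall score is the mean of
// its five dimension scores, and averageScore is the mean of the overalls.
function mean(values: number[]): number {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

function overallScore(dimensionScores: number[]): number {
  return mean(dimensionScores); // five 1-10 dimension scores
}

function reportAverageScore(personaOveralls: number[]): number {
  return mean(personaOveralls); // one overall per persona
}
```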
Scores are persona-relative - calibrated to what this specific persona experiences, not an absolute standard. The prompt explicitly enforces that a Low literacy first-time user and a High literacy returning user on the same page must differ by more than 1 point across relevant dimensions. Compressed scores (e.g. two contrasting personas landing at 6.2 and 6.8) indicate the simulation is not differentiating correctly.
Calibration bands
Near-flawless usability. Industry-leading UX. Extremely rare on first iteration.
Clear, functional, minor friction only. A well-designed commercial site or Figma prototype typically lands here.
Usable but with noticeable friction. Some tasks require effort. Common on first-launch MVPs.
Significant friction. Multiple confusion points or task failures. Needs substantial design work.
Core tasks cannot be completed. Fundamental UX failures present. Must be addressed before any launch.
Dimensions
How clearly does the page communicate what it is and what to do next - for this specific persona. A Low literacy user may score a complex options matrix at 3/10 while a High literacy user scores it 7/10. Poor clarity means users must infer intent rather than read it.
How cleanly the correct path is laid out. Penalised by backtracking, wrong turns, and disorientation - not by speed or deliberation. A persona who reads carefully and follows the correct route without deviation scores high on flow even if they pause frequently.
How well the design works for this persona's specific technical literacy, device type, and cognitive load tolerance. Accounts for dense information architecture on mobile, small touch targets, and jargon that assumes domain knowledge the persona doesn't have.
How clearly the interface communicates the result of every action. Covers loading states, error messages, success confirmations, and inline validation. A form that submits silently - with no visual response - scores 1–2 on feedback regardless of other qualities.
How well the tone, visual language, and content register for this persona emotionally. Includes anxiety from high-stakes decisions (financial risk, irreversible actions) even when the UI is technically clear. A non-refundable booking without reassurance copy scores lower for an anxious first-time buyer than for a seasoned traveller.
Your report
All findings are organised into four categories, capped at 2 items each. The cap is intentional - it forces prioritisation and prevents the report from becoming a noise-heavy laundry list. The most impactful issues surface; minor variations of the same issue are merged.
Every finding must name the specific element and its precise impact. The prompt explicitly flags "navigation is confusing" as an example of an unacceptable finding. A valid equivalent would be: "The checkout CTA sits below the fold on the initial viewport - both low-literacy personas scrolled past it without clicking."
Critical issues
Blockers that prevent task completion. Rated 0–4 on the Nielsen severity scale. Severity 4 means the task is impossible; severity 3 means it seriously impedes completion. Each finding includes the specific element, its location, which personas were affected, and the observed impact.
Friction points
Elements that slow users down or cause confusion without fully blocking them. Includes unclear labels, non-obvious interactive elements, information overload, and layout patterns that violate convention. Each includes a suggested fix (concrete design change, not a generic recommendation).
What's working
Genuinely effective elements - clear hierarchy, well-labelled CTAs, appropriate feedback, natural flow. These are as important as the failures: they tell you what not to change, and give you a baseline for what good looks like in this design.
Recommendations
Specific, prioritised design changes. Every recommendation names the exact element to change, the change to make, and the rationale tied back to observed behaviour. Ranked High / Medium / Low by impact. Critically: Friction never recommends further user testing, A/B experiments, or recruiting participants - only design and functionality changes.
Finding schema
Critical issues and friction points carry structured metadata that maps directly to actionable next steps:
severity · 0–4 integer · Nielsen severity scale. Applied to every critical and friction finding. Severity 4 = must fix before launch.
affectedPersonas · string[] · Which personas encountered this issue. Tells you whether this is universal or segment-specific.
location · string · Where on the page the issue occurs. On critical issues, this is the specific element (e.g. "Checkout CTA, below fold on mobile viewport").
suggestedFix · string · Concrete design or copy change - on friction findings. 15 words max, always a specific action.
rationale · string · Link to the observed evidence. Ties the fix back to what a specific persona actually experienced.
priority · High | Medium | Low · On recommendations only. Severity beats frequency - one severity-4 issue outranks ten severity-1 issues.
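The metadata above implies the following shape, together with the stated ordering rule that severity beats frequency. The interface name and sort helper are illustrative sketches.

```typescript
// Finding metadata shape implied by the field list above. The sort helper
// encodes the stated rule: severity beats frequency. Names are illustrative.
interface Finding {
  severity: 0 | 1 | 2 | 3 | 4;          // Nielsen severity scale
  affectedPersonas: string[];
  location: string;
  suggestedFix?: string;                // friction findings only
  rationale: string;
  priority?: "High" | "Medium" | "Low"; // recommendations only
}

// One severity-4 issue outranks any number of severity-1 issues.
function bySeverity(findings: Finding[]): Finding[] {
  return [...findings].sort((a, b) => b.severity - a.severity);
}
```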
Architecture
For developers and technical UX practitioners who want to understand what's happening under the hood.
Tech stack
Next.js 15 · App Router, API routes, server components
TypeScript · Strict mode throughout
Tailwind CSS · Utility-first styling, custom design tokens
shadcn/ui · Headless component primitives
Phosphor Icons · Icon system
Anthropic SDK · Claude API calls - server-side only
Zod · API route request validation
localStorage · Session persistence - client only, no server DB
API routes
POST /api/suggest-task · Haiku 4.5 · 120 tokens · Called on a debounced 800ms timer when a valid URL is entered. Returns one goal-based task suggestion.
POST /api/suggest-personas · Haiku 4.5 · 900 tokens · Called once at the start of a session. Returns exactly 3 persona objects as a JSON array.
POST /api/run-session · Opus 4.6 · 4,000 tokens · The main session call. Produces the complete report - simulations, heatmaps, scores, and findings - as a single structured JSON response.
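The routes validate request bodies with Zod. As a dependency-free sketch of the equivalent check for /api/run-session - the field set follows the data flow described in this document, and the names are assumptions:

```typescript
// Hand-rolled equivalent of the request validation on /api/run-session.
// Field names follow the data flow described in this document; this is a
// sketch, not Friction's actual Zod schema.
interface RunSessionBody {
  url: string;
  urlType: "figma" | "website";
  task?: string;
  personas: unknown[]; // the 3 persona objects from /api/suggest-personas
}

function isRunSessionBody(body: unknown): body is RunSessionBody {
  if (typeof body !== "object" || body === null) return false;
  const b = body as Record<string, unknown>;
  return (
    typeof b.url === "string" &&
    (b.urlType === "figma" || b.urlType === "website") &&
    (b.task === undefined || typeof b.task === "string") &&
    Array.isArray(b.personas) && b.personas.length === 3
  );
}
```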
Session schema
interface Session {
  id: string;                      // UUID v4 - generated client-side at session creation
  url: string;                     // The tested URL
  urlType: "figma" | "website";
  task?: string;                   // User-provided or AI-inferred
  createdAt: string;               // ISO 8601 - used to compute 10-day expiry
  personas: Persona[];             // Populated after /api/suggest-personas
  report: SessionReport | null;    // Populated after /api/run-session
  status: "pending" | "complete" | "error";
}

Data flow
User pastes URL → client detects type and begins 800ms debounce
Debounce fires → POST /api/suggest-task → task suggestion displayed inline
User clicks Start test → UUID generated, session written to localStorage with status: pending
Client navigates to /session/[id] → POST /api/suggest-personas → 3 personas rendered
POST /api/run-session fires with URL + task + personas → Opus 4.6 responds with full JSON report
Report parsed and written to localStorage → session status updated to complete
UI renders heatmaps, scores, and findings from the stored report
Data & privacy
Friction has no server-side database. Here's the exact data handling:
Sessions live in localStorage only
The session list and all report data are stored under Friction-specific keys in your browser's localStorage. Nothing is written to any server. Clearing your browser data removes all sessions permanently.
Expiry is calculated client-side
Each session stores a createdAt ISO timestamp. On load, the client filters out sessions where Date.now() − createdAt > 10 days. There is no server-side TTL - expiry is enforced by the client on each page visit.
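The client-side expiry check described above reduces to a single comparison; as a sketch (constant and function names are illustrative):

```typescript
// The client-side expiry check described above: sessions older than 10 days
// are filtered out on load. Names are illustrative.
const TEN_DAYS_MS = 10 * 24 * 60 * 60 * 1000;

function isExpired(createdAt: string, now: number = Date.now()): boolean {
  return now - Date.parse(createdAt) > TEN_DAYS_MS;
}
```

On page load, the session list is filtered with `sessions.filter((s) => !isExpired(s.createdAt))`.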
What is sent to the Claude API
The API request includes: the URL you're testing, the urlType, the task description, and the persona array. No browser metadata, IP address, or user identifier is included. The API key is server-side only and never exposed to the client.
Anthropic's data handling
Requests sent to the Claude API are subject to Anthropic's API usage policy. By default, Anthropic does not use API inputs to train models. Refer to Anthropic's privacy policy for full details.
No accounts, no analytics, no cookies
Friction has no login system, no session tracking, no analytics scripts, and no third-party cookies. The only external request the client makes is to Anthropic's API - via the server-side route, not directly from the browser.
