The Two Questions That Tell You If a Candidate Can Operate AI

The whole AI assessment collapses to two questions. Can they build with it? Almost everyone passes. Can they direct it, catch it when it is confidently wrong, and override it? That one sorts the room.

To find out whether a candidate can operate AI, you need two questions, not a quiz. First: can they build with it? Second: can they direct it, catch it when it is confidently wrong, and override it? Almost everyone passes the first now. The second sorts the room in about four minutes. If your interview is doing anything more elaborate than this, it is doing less.

Operating AI sits on two axes, and only two. One is what you can make the machine do. The other is whether you can tell it no. A candidate can be strong on the first and empty on the second, and that combination is the most dangerous hire you will see this year: fast, confident, and wrong in ways that only surface on your P&L three weeks later. Everything you want to know about a candidate's AI capability lives in the distance between those two questions. Ask the two that carry the signal, and learn to read the answers.

The short version:

  • The assessment reduces to two questions. "Can they build with it?" is the Fluency check. "Can they direct it?" is the Direction check. Fluency is table stakes now. Direction is the hire.
  • You ask the build question as a demo or a story of something they made, not a tool-naming exercise. Almost every candidate clears this. Treat a pass as the floor, not the finish line.
  • You ask the direct question as a story of a time the machine was confidently wrong and they caught it. Then you read how they tell it: a story built on what felt off, what they checked, and what they rebuilt, versus one that is really about how fast they shipped.
  • The reason this works is that Direction can't be pre-cooked. Wharton researchers found people followed the AI's wrong answer around 80% of the time, growing more confident as accuracy fell. Most candidates have never once argued with the machine and won. The ones who have will tell you a very specific story.

Why the whole thing collapses to two

Most AI-skills interviews are a scavenger hunt. Which tools do you use, how do you prompt, walk me through your workflow, where do agents fit. None of these questions separates anyone, because in 2026 every candidate has a fluent answer to all of them. With "AI skills" now on roughly 2.5% of US job postings and climbing, every hiring manager is asking, and almost none can say what the questions screen for. You come out of the room with a warm feeling and no signal.

The scavenger hunt maps to one axis and pretends it is the whole map. Everything it asks about, the tools, the prompts, the workflow, the speed, is Fluency. Fluency is real and necessary. It is also now universal, which means it separates nobody. A screen everyone passes is not a screen. I made that case in full in the piece "'must have AI skills' means nothing": the phrase names the floor and calls it the ceiling.

The second axis is Direction: whether you can tell where the output should go, notice when it has drifted, and override it when it is confidently pointing the wrong way. Direction is what the whole role rests on, and almost no interview touches it. Get the two axes clear and the interview shrinks to two questions. Build is Fluency. Direct is Direction. That distinction, the user who runs the tool versus the operator who runs the outcome, is the one I built out in "stop hiring AI users, start hiring AI Operators." Two questions are how you find it in a room.

Question one: can they build with it?

Ask this as the easy one, because it is. Don't quiz. Say: show me something you made with AI that you're a little proud of, and walk me through it. Then let them talk.

You are checking a floor, and most candidates clear it comfortably. They will show you an automation they wired together, a piece of analysis they ran, a small tool they got a model to build. Good. That is Fluency, and Fluency is the price of entry. What you are listening for is not brilliance. It is whether they have put their hands on the machine, or whether they are describing a workflow they read about. The tell is texture. A person who built something can tell you the boring middle: the thing that broke, the second attempt, the bit they had to do by hand because the model wouldn't. A person who has only watched gives you a clean, frictionless arc, because friction is the part you only know if you were there.

Here is the trap. The build question feels like the important one, so interviewers spend the whole slot on it, go deep on prompt technique, and leave the room thinking they assessed AI capability. They assessed half of it. The easier half. The half that no longer separates the shortlist. Pass the candidate on Fluency, note it, and move on. The actual hire is decided by the next question.

Question two: can they direct it?

This is the one that sorts the room, and you don't ask it as a hypothetical. Hypotheticals get you the answer the candidate thinks you want. Ask for a memory instead: tell me about a time the AI was confidently wrong, and you caught it. What happened?

Now watch the shape of the story. A candidate who has never once caught the machine being wrong is telling you something by the absence. But mostly, watch how they caught it. There are three things a real story contains, and a manufactured one usually can't fake all three.

What felt off. The Operator can name the moment of suspicion. The output looked clean, the confidence was high, and something still didn't sit right: a number that was too round, a citation that pointed nowhere, a conclusion that arrived too easily. They felt the itch before they had the proof. That instinct, the refusal to accept a plausible answer just because it is plausible, is the raw material of Direction. You cannot fake it in an interview, because a person who has never felt it does not know the feeling well enough to describe it.

What they checked. Suspicion is nothing without the check. The Operator will tell you exactly what they verified: they pulled the source, re-ran the number by hand, tested the edge case the model skipped, asked the one person who would know. This is what separates a genuine catch from a lucky guess. Listen for a specific action taken against a specific doubt. Vagueness here, "I just had a feeling" with nothing after it, is a story about intuition that never got tested, which is not the same as judgment.

What they rebuilt. Then the fix. Not "I flagged it," which is a passenger pointing at the windscreen. The Operator overrode the machine and rebuilt the part that was rotten, and they can tell you which part, why the model's version was wrong, and what theirs did differently. This is Direction completing itself: suspicion, verification, override. A person who took the machine's wrong answer, saw it, checked it, and replaced it with a right one is describing the exact capability you are hiring for.

And here is the anti-signal. Some candidates answer this question with a story that is really about speed. "The AI gave me a draft that was mostly wrong, so I fixed it up fast and shipped by end of day." Listen to where the pride sits. It is on the shipping, not the catching. The wrongness is a speed bump they went over; the achievement is the timeline. That is a Fluency answer wearing a Direction costume. In an operator, the pride sits on the catch. In a fast user, it sits on the clock.

The two questions, side by side

| | What it tests | A strong answer sounds like | A weak answer sounds like |
| --- | --- | --- | --- |
| Q1: Can they build with it? | Fluency: can they make the machine produce useful work | "I built X. Here's the part that broke, and what I did by hand when the model couldn't." Texture, friction, a real middle. | "I use it across my whole workflow to boost productivity." A clean arc with no friction, describing a process they read about. |
| Q2: Can they direct it? | Direction: can they catch it, check it, and override it when it's confidently wrong | "Something felt off, the number was too round. I pulled the source, found it invented, and rebuilt that section from the real data." Suspicion, check, override. | "It was wrong so I fixed it up and shipped by end of day." Pride sits on the speed, not the catch. Vague on what felt off or what they verified. |

Why direction is the one that can't be faked

The build question can be gamed by a fluent talker. The direct question can't, for a structural reason: to invent a convincing story about catching a confidently wrong machine, you need the exact judgment the story is testing for. You would have to know what wrongness feels like before it is proven, know which check exposes it, and know how a right answer differs from the plausible-wrong one that fooled everyone else. That is Direction. Faking the answer requires having the thing. So the fakers give themselves away. Their stories are about speed, or about a wrongness that was obvious, or they have no story at all.

This matters because the failure mode is measured, not theoretical. In a Wharton study titled "Thinking, Fast, Slow, and Artificial," researchers Steven Shaw and Gideon Nave ran three experiments with 1,372 people. Accuracy on the reasoning tasks rose from 46% with no AI to 71% when the AI was right. Then, when the AI was wrong, accuracy fell to 31.5%, below the 46% people managed with no help at all, and participants followed the wrong answer around 80% of the time, confidence rising even as they got it wrong. Shaw and Nave call it cognitive surrender. Read as a hiring fact, it says most people have never won an argument with the machine, because most people never argue. The candidate who can tell you a real, textured story of catching it is describing a habit that four in five people don't have. That rarity is the signal.
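The arithmetic behind those figures is worth laying out once. What follows is a minimal sketch in Python that uses only the percentages quoted above. The variable names are illustrative assumptions, not anything from the study, and reading the roughly 80% follow rate as "four in five people" is a simplification of a per-answer rate.

```python
# Percentages as quoted in the text; the names are illustrative assumptions.
NO_AI = 46.0           # accuracy with no AI help
AI_RIGHT = 71.0        # accuracy when the AI's answer was right
AI_WRONG = 31.5        # accuracy when the AI's answer was wrong
FOLLOWED_WRONG = 80.0  # "around 80%" followed the wrong answer

# Change in accuracy relative to working with no AI at all.
gain_when_right = AI_RIGHT - NO_AI
loss_when_wrong = AI_WRONG - NO_AI

# The remainder is the share that did not follow the wrong answer.
did_not_follow = 100.0 - FOLLOWED_WRONG

print(f"AI right: {gain_when_right:+.1f} points vs. no AI")
print(f"AI wrong: {loss_when_wrong:+.1f} points vs. no AI")
print(f"Did not follow the wrong answer: about {did_not_follow:.0f}%")
```

On these numbers a right AI adds 25 points and a wrong one costs 14.5, so an unchecked wrong answer leaves people worse off than having no tool at all. The roughly 20% who did not follow it are the habit the second question is looking for.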

Which is also why the question has to be asked live, out of a real memory, in a room. Everything a candidate hands you in advance, the résumé, the take-home, the portfolio, can be produced by the machine you are trying to test them against. I walked through that collapse in "every hiring signal AI can now fake." The wrong-answer story survives because it cannot be pre-cooked. There is no artifact to counterfeit. There is only a person telling you about a moment, and either the moment happened to them or it didn't.

What to do with the answers on Monday

Score two things, not twelve. Did they clear the Fluency floor on question one, yes or no. Did they give you a real, three-part catch on question two, suspicion then check then override, or a speed story. That's the assessment. Clear floor plus genuine catch is an Operator. Dazzle on the build and go vague on the catch is a fast user, and the fast user is the expensive hire, because they ship confident mistakes at the speed of the tool.
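If it helps to see the scoring written down, here is a minimal sketch in Python. Everything in it is hypothetical scaffolding: the `InterviewNotes` fields, the `classify` function, and the labels are illustrative names, not part of any tool. The "below floor" outcome is an assumption, since the text only describes candidates who clear question one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterviewNotes:
    """Hypothetical record of what the interviewer heard."""
    cleared_fluency_floor: bool    # question one: yes or no
    named_what_felt_off: bool      # question two, part 1: suspicion
    named_what_they_checked: bool  # question two, part 2: check
    named_what_they_rebuilt: bool  # question two, part 3: override


def genuine_catch(notes: InterviewNotes) -> bool:
    """A real catch needs all three parts, not just the word 'wrong'."""
    return (
        notes.named_what_felt_off
        and notes.named_what_they_checked
        and notes.named_what_they_rebuilt
    )


def classify(notes: InterviewNotes) -> str:
    if not notes.cleared_fluency_floor:
        # Assumption: the text does not name this case.
        return "below floor"
    if genuine_catch(notes):
        return "operator"
    # Cleared the floor but gave a speed story or a vague catch.
    return "fast user"


# Usage: strong build, but only a vague "something felt off".
print(classify(InterviewNotes(True, True, False, False)))
```

The judgment calls still happen in the room. The sketch only records them as four booleans, which is the point: two things to score, not twelve.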

Two cautions. Don't let a great build answer buy a weak direct answer. They are different axes, and Fluency does not predict Direction. The most fluent candidate in the room can be the one most surrendered to the machine, because it has never let them down. And don't accept the direct answer as a checkbox. A story with the word "wrong" in it isn't the signal; the signal is the texture underneath. Asking is not the same as checking, which is the deeper trap I pull apart in "how are you actually checking AI skill."

Do this and your interview measures the axis a CV can't carry. Not what a candidate can make AI do, since everyone can, but whether they can tell it no when it is confidently wrong. That is the whole hire. It always was. AI just made it the only part that still separates anyone, and shrank the test that finds it down to two honest questions.


Frequently asked questions

What are the two questions to assess AI skills in an interview?
Can they build with it, and can they direct it. The first, asked as a demo or a story of something they made, checks Fluency, and almost everyone passes now. The second, asked as a story of a time the AI was confidently wrong and they caught it, checks Direction, which is the ability to notice, verify, and override the machine. The gap between the two answers is the whole assessment.

How do I ask the build question without turning it into a tool quiz?
Don't list tools. Say "show me something you made with AI that you're a little proud of, and walk me through it." Then listen for texture: the thing that broke, the second attempt, the part they did by hand. A person who built something describes friction. A person who only read about the workflow gives you a clean, frictionless arc.

What does a strong answer to the "confidently wrong" question sound like?
It has three parts. What felt off, a specific moment of suspicion before there was proof. What they checked, a specific action taken against that doubt, like pulling the source or re-running a number. And what they rebuilt, the override, where they replaced the machine's wrong answer with a right one and can say why. If the story's real pride is on how fast they shipped, that is a Fluency answer dressed as a Direction one.

Why not just give candidates a take-home with a wrong AI answer buried in it?
You can, but the take-home is done unsupervised, with the same AI you are testing them against, so a polished result proves the tool works, not that the person can direct it. The live wrong-answer story survives because it cannot be pre-cooked. There is no artifact to fake, only a memory that either happened or didn't.

Isn't a fluent, fast candidate exactly who I want?
Only if the fluency comes with direction. The Wharton study by Shaw and Nave found people followed the AI's wrong answer around 80% of the time while growing more confident. A fast, fluent user who never argues with the machine ships confident mistakes at the speed of the tool. Speed without the catch is not an asset. It is a faster route to a wrong answer.

Ivanooo built the AI Operator Profile to measure the second axis at scale: not what a candidate can make AI do, but whether they can direct it when it is confidently wrong. If your interview only ever asks the build question, this is the one it's missing.