dConcept/ hk
AI at work · 14 min read

Start with workflows, not AI tools

How to find where AI actually helps. Map the workflow first, name the job, then pick the tool. Includes a five-part audit you can run in an afternoon.

Key takeaways

  • AI work should start with the workflow, not the tool.
  • A useful workflow has a trigger, inputs, decisions, outputs and visible drag.
  • The first pilots should be high-impact, low-risk and easy to review.

Why the tool question comes too early

Most adoption problems are not about missing ChatGPT seats. They are about unclear jobs. When someone asks which AI to use for work, they are usually trying to skip the boring part: naming the task tightly enough that any tool would matter.

The market answers that question with a hundred products, each promising to "10x" something. Without a workflow to anchor them, those products become entertainment or shelf-ware. You buy seats, run a few flashy demos, and quietly go back to spreadsheets because nobody defined the trigger, the inputs, the checker, or the output format.

Starting with the workflow does not mean ignoring tools forever. It means waiting until you can describe what success looks like for one specific task someone already does on a regular schedule.

What counts as a workflow (and what does not)

A workflow is repeatable work with parts you can point at. It either happens on a cadence (weekly report, monthly forecast, intake triage) or always starts from a similar blank page (brief, scope, reply, summary). One-off strategy off-sites still matter, but they are bad first candidates because you cannot rehearse them cheaply or measure the outcome.

Vague missions like "be more innovative with AI" rarely produce anything durable. You want something you can sketch end to end: triggers (email arrives, Monday 9 a.m., ticket created), people involved, hand-offs, the actual artefact (doc, row, message), and the friction you can put a clock or a counter on.

If you cannot sketch that on half a page, do the clarity work first. Otherwise you will tune prompts for a story that changes every time you tell it.

A five-part audit you can reuse

Use the same scaffold for every candidate task. Keep answers short. Half a dozen lines, not an essay.

Trigger. What kicks this work off reliably? Frequency matters. Daily beats yearly for learning loops.

Inputs. The raw materials. Files, inbox threads, URLs, voice notes. Note whether they are messy, semi-structured, or already in rows and columns.

Decisions. Where does judgment sit? Classification, drafting tone, prioritisation, anomaly detection. These are different cognitive jobs, both for humans and for models.

Outputs. The artefact someone else recognises as done. Slides, CSV rows, replies, tickets, summaries. If you plan to automate, lock the format tightly.

Drag. Where do the minutes leak? Transcribing, rewriting the same opener, hopping between tabs, reconciling numbers, rewriting history for a stakeholder.

Once those five blocks fit on one page, classify the job honestly. Drafting, rewriting, extracting, clustering, translating, retrieving, scripting, coordinating. Mislabelling the job is how you end up asking a chat assistant to behave like a database, or asking automation to behave like counselling.
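The five blocks and the job label can also be kept as a small record, so every candidate task is described the same way. The sketch below is one possible shape, not a prescribed format: the class name `WorkflowAudit`, the `gaps` helper, and the example values are all illustrative assumptions. The job labels come from the list above.

```python
from dataclasses import dataclass

# Job labels taken from the list in the text above.
JOB_LABELS = {
    "drafting", "rewriting", "extracting", "clustering",
    "translating", "retrieving", "scripting", "coordinating",
}


@dataclass
class WorkflowAudit:
    """One candidate task described in the five audit parts plus a job label.

    Hypothetical structure for illustration only.
    """
    name: str
    trigger: str
    inputs: list[str]
    decisions: list[str]
    outputs: list[str]
    drag: list[str]
    job_label: str = ""

    def gaps(self) -> list[str]:
        """Return the names of the parts that are still empty or unlabelled."""
        missing = []
        if not self.trigger.strip():
            missing.append("trigger")
        for part in ("inputs", "decisions", "outputs", "drag"):
            if not getattr(self, part):
                missing.append(part)
        if self.job_label not in JOB_LABELS:
            missing.append("job_label")
        return missing


# Example values are invented for illustration.
weekly_report = WorkflowAudit(
    name="Weekly status report",
    trigger="Monday 9 a.m.",
    inputs=["inbox threads", "ticket export (CSV)"],
    decisions=["prioritisation", "drafting tone"],
    outputs=["one-page summary doc"],
    drag=["hopping between tabs", "rewriting the same opener"],
    job_label="drafting",
)

print(weekly_report.gaps())  # prints [] because every part is filled in
```

An audit that returns a non-empty list from `gaps()` is a prompt to do more clarity work, not a reason to pick a tool.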

Mapping impact and risk before touching a model

Not every bottleneck deserves AI yet. Plot candidates on two axes: time saved or quality gained, against the cost of being wrong. High impact with low, easily reversed risk goes first.

High impact with high stakes is a different conversation. Anything legal-adjacent, pricing, redundancy, or regulated advice belongs behind human review, narrower tools, and possibly private deployments. Not open-ended prompting on a consumer account.

Low-impact nuisances rarely justify custom automation. Sometimes a trimmed template solves 80 percent of the pain. Be willing to refuse work that only looks glamorous because the buzzwords pile up around it.
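The two-axis plot above can be reduced to a rough triage rule. The sketch below assumes each candidate gets a 1 to 5 score for impact and for cost of being wrong, with 3 as the cut-off for "high". The scale, the threshold, the function name `triage`, and the wording of the returned labels are assumptions for illustration. The low-impact, high-risk quadrant is not discussed above, so treating it as "skip" is also an assumption.

```python
def triage(impact: int, risk: int, threshold: int = 3) -> str:
    """Place a candidate task in one of four quadrants.

    impact: 1-5, time saved or quality gained.
    risk:   1-5, cost of being wrong (higher is worse, harder to reverse).
    The 1-5 scale and the threshold are illustrative assumptions.
    """
    if not (1 <= impact <= 5 and 1 <= risk <= 5):
        raise ValueError("impact and risk must be between 1 and 5")

    high_impact = impact >= threshold
    high_risk = risk >= threshold

    if high_impact and not high_risk:
        return "pilot first"
    if high_impact and high_risk:
        return "human review and narrower tools"
    if not high_impact and not high_risk:
        return "try a template before automating"
    return "skip for now"  # low impact, high risk: assumed, not from the text


# Invented scores for three example tasks.
print(triage(impact=4, risk=1))  # pilot first
print(triage(impact=5, risk=5))  # human review and narrower tools
print(triage(impact=1, risk=2))  # try a template before automating
```

The scores themselves are judgment calls. The value of writing them down is that two people can disagree about a number instead of about a feeling.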

Match the job to capability, not hype

Draft-first writing works when tone and structure repeat. It fails when factual grounding has to be perfect and you have not attached sources.

Extraction (pulling fields out of unstructured text) is solid when the schema is steady. It breaks when departments rename columns every fortnight.

Classification and routing work when you have labelled examples. Without examples you are prototyping policy, not shipping ML.

Research assistance holds up when constraints are explicit. Jurisdiction, timeframe, authoritative sources. Not “tell me everything about this market segment.”

Light automation glues systems together when the triggers are deterministic. Brittle selectors and changing UIs eat maintenance hours nobody planned for.

Treat each of these as a hypothesis, not a verdict. Pilot on copies of real artefacts and measure rework minutes before celebrating.
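Measuring rework minutes can be as simple as comparing the old end-to-end time per run with the new draft time plus the time spent fixing the output. The sketch below assumes you logged both for a handful of runs. The function name `minutes_saved_per_run` and all the numbers are invented for illustration.

```python
def minutes_saved_per_run(
    baseline_minutes: list[float],
    pilot_runs: list[tuple[float, float]],
) -> float:
    """Average minutes saved per run in the pilot.

    baseline_minutes: time per run before the pilot.
    pilot_runs: (minutes to produce the draft, minutes of review and rework)
                for each run during the pilot.
    A negative result means the pilot costs more time than it saves.
    """
    if not baseline_minutes or not pilot_runs:
        raise ValueError("need at least one baseline run and one pilot run")

    baseline_avg = sum(baseline_minutes) / len(baseline_minutes)
    pilot_avg = sum(draft + rework for draft, rework in pilot_runs) / len(pilot_runs)
    return baseline_avg - pilot_avg


# Invented log: three runs done by hand, three runs with an AI draft.
baseline = [40, 35, 45]                 # average 40 minutes
pilot = [(5, 20), (5, 30), (5, 10)]     # totals 25, 35, 15; average 25 minutes

print(minutes_saved_per_run(baseline, pilot))  # 15.0
```

Counting rework inside the pilot total is the point of the exercise: a draft that arrives in five minutes but needs thirty minutes of fixing has saved very little.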

Failure modes when you skip workflow thinking

Tool-first pilots usually plateau at “cool demo” because nobody owns the maintenance. Prompts go stale, CSV headers shift, and the person who understood the glue leaves.

Teams also blame the model when the real problem was ambiguous success criteria. If two reasonable humans disagree on whether the output is correct, the model's output will swing too.
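One cheap way to test whether the success criteria are clear enough is to have two reviewers judge the same outputs independently and count how often they agree. The sketch below computes plain percent agreement. The function name `agreement_rate` and the sample verdicts are illustrative assumptions, and what counts as "good enough" agreement is left for you to decide.

```python
def agreement_rate(reviewer_a: list[bool], reviewer_b: list[bool]) -> float:
    """Share of outputs on which two reviewers gave the same verdict.

    Each list holds one verdict per output: True for "acceptable",
    False for "not acceptable". Both lists must cover the same outputs
    in the same order.
    """
    if len(reviewer_a) != len(reviewer_b):
        raise ValueError("both reviewers must judge the same outputs")
    if not reviewer_a:
        raise ValueError("need at least one judged output")

    matches = sum(a == b for a, b in zip(reviewer_a, reviewer_b))
    return matches / len(reviewer_a)


# Invented verdicts on five sample outputs.
a = [True, True, False, True, False]
b = [True, False, False, True, True]

print(agreement_rate(a, b))  # 0.6
```

If the two reviewers disagree often, tighten the definition of a correct output before changing the prompt or the tool.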

Then there is shadow usage. People route sensitive data through consumer tools because the official channels feel slow. That is a policy signal: your sanctioned path is not solving a real loop. Fix the loop, not just the policy PDF.

What to do next week

Pick one recurring task you personally own. Run the five-part audit on paper. Show it to a peer and challenge each other on whether the job label is honest.

Only then compare tools. Which ones accept your data posture, support the review you need, and survive the next org reshuffle? If the answer is still unclear, that is useful. It means you need conversation and iteration, not another pricing page.

When one loop works in the open, document the trigger, the template, and the check step so others can copy the pattern. That is how AI practice compounds without turning into a transformation programme.