Spec-Driven Development with GitHub Copilot

You don't need to know what a specification is, or how to write one, to work in a structured, deliberate way with Copilot. The Plan agent does that for you - it interviews you about what you want, asks the right clarifying questions, and turns your answers into a concrete implementation plan before a single line of code is written.

This tutorial walks through the full cycle using a real example: a Python tool that reads an Excel timesheet export from Maconomy and produces a project hours summary.

A sample file is included in this folder: Fictional Timesheet.xlsx - a realistic Maconomy export covering two years of consultant time, with three sheets (Week, Month, Year), ~1,300 data rows, and columns like Client Working Hours, Non-Client Working Hours, Absence, and ISO Week numbers. Use it as your test data throughout the exercise.


Why This Workflow?

Most people jump straight to asking Copilot to write code. That works for small tasks, but for anything beyond a few lines it tends to produce code that almost, but not quite, does what you wanted. You then spend your time patching it instead of thinking clearly about what you need. The workflow in this tutorial builds better habits:

| Phase | What you do | Why it matters |
| --- | --- | --- |
| Plan | Describe what you want; Copilot asks questions and writes a spec | Catches misunderstandings before they become code. Cheap to change a plan; expensive to change working code. |
| Implement | Copilot writes the code step by step, you review each change | You stay in control. Every change is shown as a diff - read it and push back if it's wrong. |
| Run it | Copilot runs the tool with your test data | Confirms the code actually works end-to-end before you invest time in tests. |
| Test | Copilot writes automated tests for the behaviour you described | Tests prove the code does what you agreed. They catch regressions when you change things later. You don't need to know how to write tests - just ask. |
| Code review | Copilot reviews what it wrote, looking for bugs and gaps | AI-generated code contains mistakes. A second pass - especially with a different model - catches things the first pass missed. |

Before You Start

Open VS Code in an empty folder - this will be your project. All files Copilot creates will land here.

Create the folder anywhere you like, then open it: File → Open Folder.

Copy Fictional Timesheet.xlsx from this tutorial folder into your project folder. You'll use it as test data throughout the exercise.


The Idea: Project Hours Summary Tool

AFRY consultants often export their logged hours from Maconomy into Excel. The resulting file is more complex than a flat table - it has three sheets at different levels of granularity (daily, monthly, yearly), separate columns for chargeable and non-chargeable hours, and ISO week numbers. Summarising that manually - utilisation rates, totals per project, trends by week - is tedious and error-prone.

We'll build a small Python tool that:

  • Reads the Week sheet from a Maconomy .xlsx export
  • Calculates chargeable utilisation - client hours as a percentage of total working hours
  • Summarises hours per project and per ISO week
  • Optionally filters by date range or project code
  • Prints the summary to the terminal

You don't need to know any of this in advance - you'll just describe the rough idea to Copilot, and the Plan agent will figure out the details with you.


Step 1: Inspect the Data

Before starting the planning interview, give Copilot a chance to look at the actual file. This makes the planning conversation much sharper - instead of asking generic questions about column names, Copilot will ask specific, useful questions based on what it can see.

Use the regular Agent for this step, not Plan. Plan mode cannot run terminal commands, so it cannot open or inspect an Excel file. You'll switch to Plan in the next step.

  1. Open the Chat panel (Ctrl+Alt+I)
  2. Click the paperclip icon in the chat input bar and attach Fictional Timesheet.xlsx (or drag it directly into the chat)
  3. Ask Copilot to describe what it sees:
I've attached a Maconomy timesheet export. Can you describe its structure?
How many sheets does it have, what are the column names on each, and what
kind of data is in each column?

Copilot will enumerate the three sheets, the column names, and what the data looks like.

Why this matters: Without seeing the file, Copilot asks generic questions and makes assumptions. With the file attached, it asks sharper questions - "should I use the Week sheet or the Month sheet?" and "should absence days be excluded from the utilisation calculation?" - which lead to a much more accurate implementation plan.


Step 2: Create the Plan

Open the Chat panel (Ctrl+Alt+I) and select Plan from the agents dropdown.

The file context from Step 1 is still available in the conversation - no need to attach it again. Describe your idea:

I want to build a Python tool that reads this Maconomy timesheet export
and gives me a utilisation summary - how much of my time was chargeable,
broken down by project and by week. Let us work on the specifications together.

That's it. Press Enter.

The Interview

The Plan agent will respond with clarifying questions. This is intentional - it's extracting the requirements from you rather than making assumptions. Answer honestly and conversationally.

A typical exchange might look like:

Copilot: The file has three sheets - Week, Month, and Year. Which should I use as the primary data source?

You: The Week sheet. It has one row per project entry per day, which gives the most flexibility for filtering and grouping.

Copilot: The Week sheet has both Client Working Hours and Non-Client Working Hours columns. Should utilisation be based only on client hours, or should I include non-client hours in the total too?

You: Total hours in the denominator (client + non-client), client hours in the numerator.

Copilot: Should absence days be excluded from the utilisation calculation, or treated as non-chargeable time?

You: Excluded. I only want to measure utilisation across days I was actually working.

Copilot: Should the project summary show Subproject No. (e.g. GBR002100-001), Subproject Name, or both?

You: Both - the code and the name next to it.

Copilot: Should this be a command-line tool with arguments, or a script you just run directly?

You: CLI. At minimum an argument for the input file path, and optionally a date range.

Keep answering until Copilot stops asking. If you're unsure about something, say so - "I don't know, you decide" is a perfectly valid answer.
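
The answers in the example exchange above are already precise enough to write the utilisation rule down. Here is a minimal Python sketch of that rule, assuming each row of the Week sheet has been read into a dict keyed by the column names from the sample file. The `utilisation_pct` name, the dict representation, and the sample values are illustrative assumptions, not what Copilot will necessarily generate.

```python
def utilisation_pct(rows):
    """Return client hours as a percentage of total working hours.

    Assumption: each row is a dict with numeric "Client Working Hours",
    "Non-Client Working Hours" and "Absence" values.
    """
    # Absence rows are excluded entirely, as agreed in the interview.
    worked = [r for r in rows if not (r.get("Absence") or 0) > 0]
    client = sum(r["Client Working Hours"] for r in worked)
    non_client = sum(r["Non-Client Working Hours"] for r in worked)
    total = client + non_client
    # Avoid dividing by zero when no working hours were logged.
    return 0.0 if total == 0 else client / total * 100


# Made-up sample rows: one working day and one absence day.
rows = [
    {"Client Working Hours": 6.0, "Non-Client Working Hours": 2.0, "Absence": 0},
    {"Client Working Hours": 0.0, "Non-Client Working Hours": 0.0, "Absence": 8.0},
]
print(f"{utilisation_pct(rows):.1f}%")  # 75.0%
```

If you can't write the rule this compactly from your own answers, there is probably still a question left to settle in the interview.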


Step 3: Review and Refine the Plan

Once the interview is done, the Plan agent produces a structured implementation plan. It will look something like:

## Implementation Plan: Maconomy Hours Summary Tool

1. Set up project structure
   - Create hours_summary.py
   - Add requirements.txt (pandas, openpyxl, tabulate)

2. Implement Excel reader
   - Read the "Week" sheet from the .xlsx file using pandas
   - Validate expected columns: Subproject No., Subproject Name, Task,
     Date, Hours, Client Working Hours, Non-Client Working Hours, Absence
   - Parse Date as datetime; parse Week as integer
   - Handle missing file or unrecognised format gracefully

3. Implement utilisation logic
   - Exclude absence rows (where Absence > 0) before calculating utilisation
   - Utilisation % = Client Working Hours / (Client + Non-Client) * 100
   - Aggregate per Subproject No. + Subproject Name

4. Implement weekly breakdown
   - Group by ISO Week number; show total and client hours per week
   - Highlight the highest- and lowest-utilisation weeks

5. Implement output
   - Print summary tables to terminal using tabulate:
     (a) Overall utilisation % and total hours
     (b) Hours and utilisation % per project
     (c) Hours per ISO week

6. Implement CLI
   - --input: path to .xlsx file
   - --from / --to: optional date range filter (YYYY-MM-DD)
   - --project: optional filter by Subproject No.

Read the plan carefully. This is the moment to catch misunderstandings before they become code.

  • Does it match what you had in mind?
  • Is there anything missing or wrong?

Push back with follow-up messages if needed:

"I don't want pandas as a dependency - use openpyxl directly"

"Add a --week argument so I can filter by ISO week number"

"Don't bother with highlighting the highest- and lowest-utilisation weeks for now - just the plain weekly table"

Iterate until the plan is right. It's cheap to change a plan. It's expensive to change code.

When you're happy, click Open in editor - this opens the plan as an editable document in VS Code. Read through it once more and make any final tweaks directly in the file.

Tip: The Plan agent automatically saves the plan to /memories/session/plan.md. You can retrieve it later via the Command Palette: Chat: Show Memory Files.

Once you're done reviewing, open a new chat (Ctrl+Alt+I), make sure the regular Agent is selected, and kick off the implementation:

Please implement the plan in #file:plan.md

Step 4: Implement

After the handoff from Plan, the Agent takes over. It will:

  • Create the project files and folder structure (hours_summary.py, requirements.txt, etc.)
  • Write the implementation following the plan step by step
  • Run terminal commands as needed (e.g., pip install openpyxl tabulate)
  • Show you inline diffs for every file change

What to Watch For

Review each file change before accepting. The agent shows diffs inline - use the overlay controls to Keep or Undo individual changes. Don't accept everything blindly.

Answer permission prompts honestly. If the agent asks to run a terminal command you don't recognise, ask it first:

What does this command do and why is it needed?

Steer mid-flight if needed. If the implementation diverges from what you discussed in the plan, say so:

The output should show a table grouped by project first, then weeks within each project - not a flat list.

Keeping the Agent on Track

If the conversation grows long, the agent may start to drift. You can re-anchor it to the plan at any point:

Look at the plan in #file:plan.md. The --project filter hasn't been implemented yet.
Please implement that next.

Step 5: Run It

Before writing tests, confirm the tool actually works end-to-end with your sample data.

Ask the Agent:

Please run the tool using Fictional Timesheet.xlsx as the input file.

The Agent will run the command in the terminal and show you the output. Check:

  • Does it print output without errors?
  • Do the numbers look plausible? (Spot-check a week manually against the raw data.)

If something is wrong, describe what you expected and let the Agent fix it before moving on to tests.
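
For the manual spot-check, it helps to have an independent calculation to compare against the tool's weekly table. Below is a rough sketch, assuming the rows are available as dicts with a `Date` value of type `datetime.date` alongside the hour columns. `hours_for_week` is a hypothetical helper and the sample values are made up. Because the sample file spans two years, the same ISO week number occurs more than once, so the sketch keys on ISO year and week together.

```python
from datetime import date


def hours_for_week(rows, iso_year, iso_week):
    """Return (client_hours, total_hours) for one ISO week."""
    client = total = 0.0
    for r in rows:
        year, week, _ = r["Date"].isocalendar()
        if (year, week) != (iso_year, iso_week):
            continue
        client += r["Client Working Hours"]
        total += r["Client Working Hours"] + r["Non-Client Working Hours"]
    return client, total


# Made-up rows: two days in week 12 of 2024 and one day in week 12 of 2025.
rows = [
    {"Date": date(2024, 3, 18), "Client Working Hours": 7.5, "Non-Client Working Hours": 0.5},
    {"Date": date(2024, 3, 19), "Client Working Hours": 6.0, "Non-Client Working Hours": 2.0},
    {"Date": date(2025, 3, 17), "Client Working Hours": 8.0, "Non-Client Working Hours": 0.0},
]
print(hours_for_week(rows, 2024, 12))  # (13.5, 16.0)
```

If the tool's figure for a week is roughly double what you expect, check whether it is merging the same week number from both years.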


Step 6: Write Tests

You don't need to know how to write tests, or even what to test - just ask:

Please write a thorough set of unit tests for this tool.

The Agent will read the implementation and generate a test file covering the main behaviours. It will also run the tests and fix any failures it finds.

Once the initial tests are passing, push for more coverage:

  • Edge cases: "Add a test for a file where all hours are 0"
  • Invalid inputs: "Test what happens if the Hours column contains text instead of numbers"
  • Boundary conditions: "Test with a single row, and with an empty sheet"
  • Your specific requirements: "Add a test that confirms absence rows are excluded from the utilisation denominator"
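
The last request in the list might produce a test along these lines. This is a sketch using the standard `unittest` module. The `utilisation_pct` function defined here is a stand-in for whatever the Agent actually generated in hours_summary.py; in a real test file you would import the generated function rather than redefine it.

```python
import unittest


def utilisation_pct(rows):
    """Stand-in for the real implementation (assumed, not the generated code)."""
    worked = [r for r in rows if not (r.get("Absence") or 0) > 0]
    client = sum(r["Client Working Hours"] for r in worked)
    total = client + sum(r["Non-Client Working Hours"] for r in worked)
    return 0.0 if total == 0 else client / total * 100


class UtilisationTests(unittest.TestCase):
    def test_absence_rows_are_excluded_from_denominator(self):
        rows = [
            {"Client Working Hours": 4.0, "Non-Client Working Hours": 4.0, "Absence": 0},
            # An absence row that also carries non-client hours: if it were
            # counted, utilisation would drop from 50% to 25%.
            {"Client Working Hours": 0.0, "Non-Client Working Hours": 8.0, "Absence": 8.0},
        ]
        self.assertAlmostEqual(utilisation_pct(rows), 50.0)

    def test_all_hours_zero_does_not_divide_by_zero(self):
        rows = [{"Client Working Hours": 0.0, "Non-Client Working Hours": 0.0, "Absence": 0}]
        self.assertEqual(utilisation_pct(rows), 0.0)
```

Saved as a test file, this runs with `python -m unittest`. Note that the first test is built so that it fails if absence rows leak into the denominator. A test that passes either way proves nothing.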

Test-First Variant

If you want to try a more disciplined approach, you can ask for tests before the implementation. Go back to Step 3, and before opening the plan in the editor, add:

Before we implement, please also add a set of failing unit tests to the plan
that verify the key behaviours. The implementation should make these tests pass.

Step 7: Code Review

AI-generated code is not always correct - it can contain bugs, miss edge cases, and make poor design decisions. Asking Copilot to review what it just wrote is a cheap way to catch problems before they cause trouble.

Tip: switch models before reviewing. The model that wrote the code is likely to miss the same things it missed when writing. Switch to a different model in the chat model picker - a fresh perspective catches more.

Self-Review Before Committing

Review #file:hours_summary.py for:
1. Potential bugs or unhandled edge cases
2. Input validation - what if the Excel file has unexpected column names or types?
3. Anything that would make this fragile or hard to maintain

Spec Compliance Check

Reference the plan Copilot generated earlier:

Based on our earlier implementation plan, does hours_summary.py implement
everything we agreed on? Are there any gaps?

Focused Review Prompts

Robustness:

What happens if the Hours column contains a blank cell or the text "n/a"?
Is this handled gracefully?
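
For comparison, a defensive answer to that question could look like the sketch below. `parse_hours` is a hypothetical helper, and treating blanks and "n/a" as zero hours is an assumption. Whether such cells should count as zero or be rejected is exactly the kind of decision to confirm during the review.

```python
def parse_hours(value):
    """Convert a raw cell value to hours, treating blanks and "n/a" as 0.0.

    Any other non-numeric value raises ValueError, so bad data is reported
    rather than silently ignored.
    """
    if value is None:  # empty cell
        return 0.0
    if isinstance(value, str):
        text = value.strip()
        if text == "" or text.lower() == "n/a":
            return 0.0
        value = text
    return float(value)


print(parse_hours("7.5"), parse_hours(None), parse_hours(" n/a "))  # 7.5 0.0 0.0
```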

Correctness:

Does the per-project summary correctly aggregate hours when the same
project appears across multiple rows, days, and weeks?

What Copilot Catches Well vs. What Needs Human Eyes

| Copilot catches well | Needs human judgement |
| --- | --- |
| Missing error handling | Whether the output format is actually useful |
| Off-by-one and aggregation errors | Whether the column names match your real files |
| Unhandled file/IO exceptions | Whether the filtering logic fits your team's workflow |
| Inconsistency with the agreed plan | Performance with very large files |

Implementing the Fixes

Once the review is done, apply everything in one go:

Please apply all the fixes you identified.

Full Workflow Summary

[Agent] Inspect the data file
[Plan]  Describe your idea → answer interview questions
        Review and edit the plan (Open in editor)
[Agent] Implement the plan → review diffs
        Run the tool with test data
        Ask Agent to write tests → review and extend
        Switch model → ask for code review → apply fixes

Tips for Getting the Best Results

You Don't Need to Know Everything Upfront

The whole point of starting in Plan mode is that Copilot extracts the requirements from you through conversation. If you don't know the answer to one of its questions, say so - it will suggest a reasonable default.

Iterate on the Plan Before Implementing

It's cheap to change a plan. It's expensive to change code. Spend time in Plan until you're genuinely happy with the steps before handing the plan over for implementation.

Re-anchor With the Plan if the Agent Drifts

In a long session, the Agent can forget earlier decisions. Remind it:

Refer back to the plan we agreed on and check you haven't missed anything.

Review Diffs, Don't Blindly Accept

Every change the Agent makes is shown as a diff. Read them. You are responsible for the code - Copilot is your assistant, not your replacement.

Save the Plan as a Spec

After a successful session, ask Copilot to write the plan out as a docs/spec.md file. This becomes your documentation, your onboarding guide, and your starting point for the next session.

Please write a clean specification document based on everything we built
and save it as docs/spec.md

Extension Exercises

Once the core tool works, try these - each one requires a new planning conversation.

1. Utilisation trend chart

Add a matplotlib bar chart showing my monthly utilisation % over the full
dataset. Each bar = one month, coloured green if above 75%, amber if 50–75%,
red if below 50%. Save it as utilisation_trend.png.
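
The colour rule in this prompt leaves the boundaries slightly open, so it is worth checking how the generated code treats exactly 50% and exactly 75%. Here is a small sketch of one reading of the rule, using a hypothetical `bar_colour` helper. It returns plain labels; mapping them to the plotting library's colour values is left out.

```python
def bar_colour(utilisation_pct):
    """Map a monthly utilisation percentage to a traffic-light label."""
    if utilisation_pct > 75:
        return "green"
    if utilisation_pct >= 50:
        return "amber"  # 50% and 75% both fall in the amber band
    return "red"


print([bar_colour(p) for p in (49.9, 50, 75, 75.1)])
# ['red', 'amber', 'amber', 'green']
```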

2. Busiest and quietest weeks

Extend the tool to print my 5 highest- and 5 lowest-utilisation weeks,
with the date range of each week and the top project worked on that week.
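
The date range of an ISO week can be derived with the standard library alone, which is handy for checking what the Agent produces. A minimal sketch, assuming Python 3.8 or later (for `date.fromisocalendar`); `iso_week_range` is a hypothetical helper name.

```python
from datetime import date, timedelta


def iso_week_range(iso_year, iso_week):
    """Return the Monday and Sunday that bound the given ISO week."""
    monday = date.fromisocalendar(iso_year, iso_week, 1)
    return monday, monday + timedelta(days=6)


start, end = iso_week_range(2024, 12)
print(start, end)  # 2024-03-18 2024-03-24
```

ISO week 1 can start in the previous calendar year, so the ISO year has to be passed in alongside the week number.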

3. Project timeline

Add a --timeline flag that prints a text-based Gantt chart showing which
projects were active in which months, based on the first and last entry
for each project in the file.
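
To judge what the Agent produces for this exercise, it helps to have a rough picture of the core logic. The sketch below is one possible approach, assuming rows are dicts with a `Subproject No.` string and a `Date` of type `datetime.date`. The `timeline` and `month_index` names, the second project code, and the dates are made up for illustration.

```python
from datetime import date


def month_index(d):
    """Number months consecutively so spans across year ends stay contiguous."""
    return d.year * 12 + d.month - 1


def timeline(rows):
    """Return one text line per project, with '#' for each active month."""
    spans = {}
    for r in rows:
        idx = month_index(r["Date"])
        first, last = spans.get(r["Subproject No."], (idx, idx))
        spans[r["Subproject No."]] = (min(first, idx), max(last, idx))
    if not spans:
        return []
    start = min(first for first, _ in spans.values())
    end = max(last for _, last in spans.values())
    lines = []
    for project, (first, last) in sorted(spans.items()):
        bar = "".join("#" if first <= m <= last else "." for m in range(start, end + 1))
        lines.append(f"{project}  {bar}")
    return lines


rows = [
    {"Subproject No.": "GBR002100-001", "Date": date(2024, 11, 4)},
    {"Subproject No.": "GBR002100-001", "Date": date(2025, 1, 20)},
    {"Subproject No.": "GBR002100-002", "Date": date(2024, 12, 9)},
]
print("\n".join(timeline(rows)))
# GBR002100-001  ###
# GBR002100-002  .#.
```

As the prompt specifies, a project counts as active in every month between its first and last entry, even in months with no logged hours.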

4. Test-first variant

Go back to Step 3 and this time, before handing the plan to the Agent, add:

Before implementing, write failing unit tests that cover:
- Correct utilisation calculation when absence rows are present
- Correct handling of a project with only Non-Client hours (utilisation = 0%)
- --from / --to date filter correctly excludes out-of-range rows
- A graceful error message when the input file is missing
Don't implement the logic yet.

Then hand off to Agent and watch it make the tests pass.


Next Steps

  • Try the same workflow on a real task from your own work
  • After the session, ask Copilot to generate the spec document from the plan - this is how spec-driven development starts to happen naturally
  • Explore custom agents to encode your team's preferred planning style into a reusable agent