Reconciling $400K of monthly ad spend - a Laravel reporting build for an 18-location client

The brief was disarmingly simple: every Monday morning, the marketing team needs to know what last week's ad spend was, broken down by channel, by region, by club, and by campaign, with the same numbers the platforms themselves report. No surprises, no "let me check that and get back to you," no spreadsheet that someone has been hand-editing for three years.

The client runs eighteen fitness club locations across two regions. Monthly ad spend across Meta, Google, LinkedIn, and TikTok lands somewhere between $300,000 and $500,000 depending on the season. The risk is not that one of those platforms is wrong - they're not. The risk is that across a dozen Google MCC sub-accounts, a Meta CSV export, a LinkedIn API pull, and four different naming conventions for "campaign," the combined view is wrong by 3% and nobody notices for two months.

3% of $400,000 is $12,000 a month. $144,000 a year. That's the size of the problem you don't see.

The shape of the build

A Laravel 12 dashboard with a MySQL backend, deployed to managed hosting. Four data sources flowing in: three via OAuth APIs, one via CSV imports. Six reporting views: All Channels, Budget Spend, Daily Pacing, Region, Club, Channel. Two admin tools: an OAuth connection manager and a manual upload form for the days the APIs are out for maintenance.

That's the surface. What makes it interesting is what's underneath - the data flow that holds together when one platform goes down or starts returning slightly different campaign names than yesterday.

The "one source of truth per platform" rule

When this dashboard started, every platform was being imported via CSV. The marketing team downloaded weekly exports from each platform, dropped them in a shared folder, and a scheduled job parsed them into platform-specific tables.

This was fine until two things happened: the CSV download steps started getting skipped on busy weeks, and the platforms started offering APIs that returned the same numbers without anyone having to download anything.

So we started rebuilding the pipeline platform-by-platform to pull via API. But here's the trap nobody warns you about: if you don't retire the CSV pipeline when you build the API pipeline, both sources write to the same combined table, and now you have race conditions.

The rule I landed on: exactly one source of truth per platform. For Google, LinkedIn, and TikTok, the API is authoritative. For Meta (which doesn't yet have a clean API path in this stack), the CSV remains authoritative. The two pipelines write to the same downstream tables but never the same rows - the unique keys ensure no overlap, and the scheduler is configured to never run a CSV import for a platform whose API is now authoritative.

The most common bug I see in dashboards like this is when somebody adds an API integration for a platform that was previously CSV, doesn't switch off the CSV scheduler, and now two jobs are writing to the same table - one with slightly cleaner data, one with slightly older data. Whichever runs last wins, and the numbers oscillate.

The three-layer data flow

Once data is in, three layers hold it:

Raw API table - one row per platform / date / campaign. Cleaned, deduplicated, but not yet enriched with business logic. Unique key prevents duplicates.
Channel-level rollup - one row per channel / date. The "executive summary" layer, fast for the high-level KPI tiles. Channel names stored lowercase here.
Campaign-level rollup - one row per channel / date / campaign, enriched with club and region mappings. The "drilldown" layer for the detailed reports. Channel names stored uppercase here.

Yes, the channel name case differs between layers. That's a quirk worth documenting somewhere prominent (and we did), because every report has to know which case to query against. The original reason was historical - the first developer used uppercase in one place and lowercase in another, and rather than rewrite three years of reports we wrote a transformation step. "Don't fight the conventions you inherited" was a useful rule here.

The Google MCC twist

Google Ads adds an extra layer of complication: the client account is one of thirteen accounts under a Manager (MCC) account. The API will happily return data for all thirteen if you ask it to. That's both a feature and a footgun.

The footgun: if you ever forget to filter, your weekly Google Ads spend triples because the dashboard is summing twelve other clients' spend along with this client's. Easy to miss because the numbers still look reasonable - Google is a big platform.

The fix: a single line in the channel-mapping config sets account_filter to the specific client account name, and the sync job filters to just that one account before writing to the combined tables. The raw API table still receives all thirteen accounts - useful for future reporting if the agency takes on more clients - but only the relevant account reaches the dashboard.

Why we don't average percentages

The Impression Share report has the most interesting math in the build, and it's the one place where the "obvious" implementation is wrong.

Suppose you have ten campaigns. Each one has a Search Impression Share for the period. You want to show "the account's overall Search IS." What do you do?

The intuitive answer is: average the ten percentages. Add them up, divide by ten.

This is wrong. Each campaign has a different impression volume - one campaign might have a million impressions and 60% IS, another might have a thousand impressions and 90% IS. Averaging them weights them equally, when in reality the million-impression campaign represents a thousand times more weight in the account's overall picture.

The correct math:

SUM(search_impr_share * impressions) / NULLIF(SUM(impressions), 0)

It's a weighted average. Every tile, every rollup, every benchmark in the IS report uses this formula. Forgetting it produces numbers that look reasonable but are wrong by 10-20% in either direction depending on campaign mix - the kind of error that's invisible until somebody compares the dashboard to the Google Ads UI and asks why they disagree.

Benchmark anchoring - the bug that took a week to spot

The Club dimension page shows a metric ("Conv. Rate") for the selected date range, plus three benchmark numbers: 7-day, 30-day, year-to-date.

The naive implementation anchors those benchmarks to today. So if the user is looking at April, they see April's conv rate plus the rolling 7-day-from-today and 30-day-from-today benchmarks.

That's wrong, and it's wrong in a subtle way. The user is reviewing April. They want to know how April compares to a 7-day-around-April and 30-day-around-April benchmark, not to today's benchmark. Showing today's benchmark next to April's numbers is comparing two different time periods, and the conclusions you draw are nonsense.

The fix is one line: anchor every benchmark window to the end date of the selected range, not now(). The 7-day, 30-day, and YTD windows all trail backwards from $to. Now when the user reviews April, the benchmarks are also April-anchored.

This was a week of "why does the YoY comparison look so weird" before we caught it. It's now a permanent comment at the top of the Club controller.

The ghost-row bug

The single most-frustrating production issue this dashboard has hit was the ghost-row bug.

Here's how it goes: Google's API returns campaign data with both an ID and a name. Most of the time the name is real ("Brand - Region - Search"). Occasionally, when the API can't resolve the name (deleted campaign, sub-account permissions, brief outage), it returns the literal string "Campaign 490180104" - the ID rendered as the name.

The unique key on the combined table is (channel, date, campaign). The same campaign on the same date with two different names becomes two rows. Both have the spend. The downstream sum is doubled.

The detection took weeks. The fix is a one-line SQL filter that deletes rows where the campaign name matches the pattern "Campaign [0-9]+" before the sync runs. Anything matching that pattern is a ghost row by definition.

It's now in the platform onboarding checklist: before connecting a new API, scan the existing combined table for ghost rows, delete them, then sync. Otherwise the first day of API data inflates against the legacy CSV data and nobody knows why.

Reconciliation, three layers deep

What makes this dashboard trustworthy is not that the numbers look right - it's that any number can be traced backwards through three layers to a primary source.

A row in the campaign rollup table can be sourced back to a row in the channel rollup table. That row can be sourced back to one or more rows in the raw API table. That row can be sourced back to the API's own response for that date and campaign. Which can be reconciled against the platform's own UI.

When a number on the dashboard looks weird, the question is never "is the dashboard right?" - it's "where in the four-layer chain did the wrongness start?" That's a question you can answer in minutes. "Is the dashboard right?" is a question that takes weeks.

A note on the AI assistance layer

The Club page has a panel that lets a marketing manager request an AI assessment of the club's recent performance. The assessment is an LLM-written analysis of the actual numbers - what's up, what's down, what to look at next - and it gets saved as an actionable item in a follow-up queue. Status cycles through open, actioned, resolved, and discarded.

The same rule from an earlier post applies here: AI gets to write the commentary, not the numbers. Every figure the assessment references is pulled from the database before the prompt is built and inserted verbatim. The model only sees what's already true and is explicitly told not to invent numbers. The follow-up queue then makes those assessments operational - they're not just a paragraph, they're a unit of work that can be picked up and closed by a real human.

This is the version of "AI in marketing reporting" that actually moves the work forward rather than producing impressive-sounding paragraphs nobody acts on.

What I would change

Some things on the wishlist that aren't built yet:

A nightly reconciliation report. For each platform, pull the platform's own UI total for yesterday, compare to the dashboard's total, alert if they differ by more than 1%. This is hard because the platform UIs don't all have clean APIs for "give me yesterday's totals," but it's exactly what the dashboard should self-monitor for.

Switch to a managed cloud database. The current host's MySQL has done the job, but the deployment constraints (no git pull on the server, must use the dashboard UI) and the OPcache stale-bytecode issue suggest that a managed Postgres or BigQuery instance with a more flexible deploy story would save real time over a year. The cost difference at this data volume is negligible.

More aggressive caching at the report layer. Some of the heavier reports re-query a year's worth of data on every page load. With careful invalidation, most of these results could be cached for an hour. Not blocking, just future quality-of-life.

But the core promise - that every Monday morning the marketing team gets the same numbers the platforms report, that any number can be reconciled to a primary source, that the dashboard hides nothing and surfaces everything - has held up across a year of weekly use. That's the version of "working software" that actually matters in this work.