Why most people counting pilots fail to prove anything
A people counting pilot is supposed to settle a procurement question. Does this system count accurately enough to trust the number, does the data move a real operating decision, and is the cost worth the value it returns? In practice, pilots often end inconclusive. The team installed sensors at a single door, watched a dashboard for a few weeks, declared the experience interesting, and then could not make a confident recommendation to the budget holder. The vendor disappears, the spreadsheet sits in a shared drive, and the procurement question gets pushed to the next quarter.

The cause is almost always the same: the pilot was not designed as a measurement. It was designed as a demo. A demo answers "does it look like it works?". A pilot answers "does it return enough value, with enough confidence, that we should roll it out?". Those are different questions and they need different structures.
This post sets out a four-week structure for a people counting pilot that produces a defensible ROI answer at the end. It covers scope definition, baseline establishment, success criteria, the data hand-off format, the internal stakeholder cadence, and the common pitfalls that quietly turn a pilot into a demo. The structure works for retail, malls, transport, cultural sites, and public-realm programmes; the specifics of what you count change, the method does not.
Define the scope before you order a sensor
Scope is the single decision that determines whether the pilot can produce a usable answer. It has four parts, and each one should be written down and signed off before any hardware ships.
The decision the pilot serves
Start with the operating decision the count is meant to inform. "Should we extend Sunday opening?" is a decision. "Are we overstaffed on Tuesday mornings?" is a decision. "Should we keep this anchor space leased to the current tenant?" is a decision. "Do we have enough capacity to run the late-night programme?" is a decision. "We want to understand our customers better" is not a decision; it is a vibe. A pilot tied to a real decision has a clear pass-fail at the end: the data either changed the call or it did not.
The zones that matter
Pick the smallest set of zones that can answer the decision. For a single store, that is usually the entrance plus one or two interior zones. For a mall, that is the entrances plus the anchor frontages and a couple of common-area choke points. For a tourist attraction, that is the main entry plus the exhibition rooms or galleries that drive the ticket. Resist the temptation to count everything. Each extra zone adds a sensor, a wiring run, a calibration step, and a column on the report. The aim is a clean answer to a specific decision, not a complete picture of the building.
The comparison the pilot will draw
A count on its own answers nothing. The pilot needs a comparison built in. Three comparisons cover almost every case.
- Before and after. An intervention is planned during the pilot: a layout change, an extended opening hour, or a new staff schedule. The pilot measures footfall, occupancy, and dwell before the change and after it, in the same zones at the same hours.
- Same site, two periods. No intervention is planned, but the pilot covers a representative four weeks against a known reference: either the same four weeks a year earlier (if a clicker or POS proxy exists) or the same days of the week back-to-back, with weather and event days flagged.
- Two sites, same period. A multi-site organisation pilots in two locations with different characteristics. The point is not to rank the sites; it is to show that the data exposes the differences the operations team already suspects and lets them quantify the gap.
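As a rough illustration of the before-and-after comparison, the sketch below matches hourly counts on the same zone, weekday, and hour and reports the percentage change for each slot. The tuple layout and the function name `pct_change_by_slot` are assumptions made for this example, not part of any vendor feed.

```python
from collections import defaultdict
from statistics import mean


def pct_change_by_slot(before, after):
    """Percentage change in mean hourly count per (zone, weekday, hour) slot.

    `before` and `after` are lists of (zone, weekday, hour, count) tuples.
    Only slots present in both periods, with a non-zero 'before' mean,
    are compared, so the result is always like-for-like.
    """
    def slot_means(rows):
        buckets = defaultdict(list)
        for zone, weekday, hour, count in rows:
            buckets[(zone, weekday, hour)].append(count)
        return {slot: mean(values) for slot, values in buckets.items()}

    b, a = slot_means(before), slot_means(after)
    return {
        slot: 100.0 * (a[slot] - b[slot]) / b[slot]
        for slot in b.keys() & a.keys()
        if b[slot] > 0
    }
```

Matching on the slot rather than on daily totals keeps a busy Saturday afternoon from being compared with a quiet Tuesday morning. The same function works for the two-period and two-site designs if the second period or second site is passed as `after`.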
The duration
Four weeks is the right length for most pilots. It is long enough to cover four of each weekday, capture at least one event or weather anomaly, and let the team see the rhythm of the data rather than a snapshot. Shorter pilots get fooled by a single bad week. Longer pilots drift into being the deployment and the success criteria get forgotten. Pick a four-week window that does not straddle a major school holiday or public-holiday week, unless the holiday is itself the decision being studied.
Establish the baseline in week one
A pilot without a baseline is a chart with no axis. The first week of the pilot exists to set that axis: how the site behaves before any change, how the sensor compares against a known reference, and how the team will read the data going forward.
Validate the sensor against a known reference
On day one or two of the pilot, run a manual count alongside the sensor for two short windows, a quiet morning hour and a busy afternoon hour. A staff member with a clicker at the door, or a video review of an existing CCTV feed already on site, will do. Compare the manual count to the sensor count for the same window and record the percentage difference. A well-calibrated counter sits well within single digits of the manual count. If it does not, the sensor needs to be re-aimed or recalibrated before the pilot starts producing the data the decision will rest on. Doing this once, in writing, removes the "but is it really accurate?" question from every later meeting.
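The validation arithmetic is simple enough to write down once and reuse. The sketch below is a minimal version: it assumes two validation windows and a five per cent threshold (the example threshold used for success criteria later in this post), and the counts in the usage note are made up for illustration.

```python
def pct_difference(manual_count, sensor_count):
    """Signed percentage difference of the sensor count from the manual count."""
    if manual_count <= 0:
        raise ValueError("manual_count must be positive")
    return 100.0 * (sensor_count - manual_count) / manual_count


def validate(windows, threshold_pct=5.0):
    """Check each validation window against the agreed threshold.

    `windows` is a list of (label, manual_count, sensor_count) tuples.
    Returns a list of (label, percentage difference, passed) tuples.
    """
    results = []
    for label, manual, sensor in windows:
        diff = pct_difference(manual, sensor)
        results.append((label, round(diff, 2), abs(diff) <= threshold_pct))
    return results
```

With illustrative counts, `validate([("quiet morning", 40, 39), ("busy afternoon", 200, 214)])` marks the first window as passing at -2.5 per cent and the second as failing at 7.0 per cent, which would be the signal to re-aim or recalibrate before continuing.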
Capture a typical-week shape
Use week one to read the natural rhythm of the site: which hours are busiest, where the dwell peaks sit, when the building empties. Plot a weekday-by-hour heat map. Plot a weekday-versus-weekend split. The team should be able to point at the chart and recognise the site. If they cannot, the sensor placement is wrong or the zone definitions are unclear and the rest of the pilot will be working from a broken baseline.
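One way to build the numbers behind those two charts, sketched with the standard library only. It assumes the hourly feed arrives as ISO 8601 timestamps paired with counts; the function names are hypothetical and the plotting itself is left to whatever tool the team already uses.

```python
from collections import defaultdict
from datetime import datetime

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def weekday_hour_means(rows):
    """Mean hourly count per (weekday, hour) cell of the heat map.

    `rows` is an iterable of (iso_timestamp, count) pairs, one per hour.
    """
    totals = defaultdict(lambda: [0.0, 0])  # (weekday, hour) -> [sum, n]
    for iso_ts, count in rows:
        ts = datetime.fromisoformat(iso_ts)
        cell = totals[(WEEKDAYS[ts.weekday()], ts.hour)]
        cell[0] += count
        cell[1] += 1
    return {key: total / n for key, (total, n) in totals.items()}


def weekday_weekend_split(grid):
    """Average of the heat-map cell means, split into weekday and weekend."""
    weekday = [v for (day, _), v in grid.items() if day not in ("Sat", "Sun")]
    weekend = [v for (day, _), v in grid.items() if day in ("Sat", "Sun")]
    return {
        "weekday": sum(weekday) / len(weekday) if weekday else None,
        "weekend": sum(weekend) / len(weekend) if weekend else None,
    }
```

The grid returned by `weekday_hour_means` can be handed straight to a spreadsheet pivot or a BI heat-map visual.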
Lock the baseline metrics in writing
By the end of week one, lock in writing the baseline values you will compare against. For a retail pilot that might be average daily entries by weekday, average dwell in the interior zone, and the conversion rate from entries to transactions if POS is integrated. For a mall pilot it might be entries per door, anchor-frontage capture rate, and common-area dwell. The exact metrics depend on the decision; the discipline is the same. The baseline is the number the success criteria will measure against, so it has to be agreed before the intervention or the comparison period starts.
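A minimal sketch of what a locked retail baseline could look like as a dated record. It assumes daily entry totals keyed by date and a single POS transaction total for the week; the field names and the JSON layout are illustrative choices, not a prescribed format.

```python
import json
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def avg_daily_entries_by_weekday(daily_entries):
    """Mean entries per weekday from a {date: entries} mapping."""
    buckets = defaultdict(list)
    for day, entries in daily_entries.items():
        buckets[WEEKDAYS[day.weekday()]].append(entries)
    return {name: sum(vals) / len(vals) for name, vals in buckets.items()}


def conversion_rate(entries, transactions):
    """Share of entries that ended in a transaction."""
    if entries <= 0:
        raise ValueError("entries must be positive")
    return transactions / entries


def locked_baseline(daily_entries, transactions_total, locked_on):
    """Serialise the agreed baseline as a dated JSON record."""
    record = {
        "locked_on": locked_on.isoformat(),
        "avg_daily_entries_by_weekday": avg_daily_entries_by_weekday(daily_entries),
        "conversion_rate": conversion_rate(sum(daily_entries.values()), transactions_total),
    }
    return json.dumps(record, indent=2, sort_keys=True)
```

Writing the record to a file with the date in it gives the team something concrete to sign off, and makes it obvious later if anyone tries to move the baseline.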
Write the success criteria up front
A pilot without written success criteria becomes an opinion meeting at the end. Someone in the room declares the data was "interesting" or "useful". Someone else says it was "not what we expected". No procurement decision can survive that conversation. The fix is to write the success criteria before week two and circulate them to the budget holder.
Strong success criteria have three properties.
- They are written as a threshold, not an aspiration. "Accuracy within five per cent of manual count across two validation windows" is a threshold. "Highly accurate" is an aspiration. Thresholds get a yes or no answer at the end of the pilot.
- They cover both the measurement and the decision. One criterion proves the sensor reads accurately. A second criterion proves the data moves the decision the pilot was set up to serve. A pilot that hits the first but fails the second tells the budget holder exactly what to do with the report.
- They include a financial frame. Even a rough one. "If the data supports a Sunday opening extension that adds five per cent to monthly footfall, the system pays for itself in eight months at the quoted rollout cost." The financial line is what turns a pilot conclusion into a budget request the finance team can act on.
Three to five criteria is usually right. More than that and the report becomes unreadable. Fewer and the pass-fail is too coarse for the budget holder to defend in the meeting where the decision is actually made.
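To make the threshold idea and the financial frame concrete, here is a small sketch. The `Criterion` class and the payback formula are assumptions made for illustration: the formula converts extra footfall into margin using a conversion rate and an average margin per transaction, and it ignores recurring costs. Every number in the usage note is invented.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    """One written success criterion with a yes-or-no outcome."""
    name: str
    measured: float
    threshold: float
    higher_is_better: bool = False

    def passed(self) -> bool:
        if self.higher_is_better:
            return self.measured >= self.threshold
        return self.measured <= self.threshold


def payback_months(rollout_cost, baseline_monthly_footfall, uplift_pct,
                   conversion_rate, avg_margin_per_transaction):
    """Months until the rollout cost is covered by the extra monthly margin."""
    extra_visits = baseline_monthly_footfall * uplift_pct / 100.0
    monthly_value = extra_visits * conversion_rate * avg_margin_per_transaction
    if monthly_value <= 0:
        return float("inf")  # no uplift means the cost is never recovered
    return rollout_cost / monthly_value
```

For example, with a hypothetical rollout cost of 40,000, a baseline of 50,000 monthly visits, a five per cent uplift, a 20 per cent conversion rate, and a margin of 10 per transaction, `payback_months(40000, 50000, 5, 0.2, 10)` comes out at eight months. An accuracy criterion would be written as `Criterion("accuracy vs manual count (%)", measured=2.5, threshold=5.0)`.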
Agree the data hand-off format
Vendors love their dashboards. Internal teams use the tools they already have: a planning spreadsheet, a Tableau workbook, a Looker view, a Power BI report. A pilot that lives only in the vendor portal will be looked at twice in week one, opened once more in week three, and never again. The data has to land where the team works.

Before the pilot starts, agree on three deliverables.
- A daily CSV or API feed. Hourly counts by zone, occupancy by zone where relevant, dwell by zone where relevant, with a clear timestamp standard and a column dictionary. This is the data that gets pulled into the spreadsheet or BI tool the team already uses.
- A weekly read-out PDF or notebook. Three to five charts the operations team can read in two minutes: weekday-by-hour heat map, daily totals against the baseline, dwell by zone, plus a short text summary of anomalies (weather, events, staff changes). This is what gets shared in the weekly cadence meeting.
- A final pilot report. Twelve to twenty pages: the scope as agreed at the start, the baseline as locked in week one, the success criteria as written before week two, the data against each criterion, the comparison the pilot was set up to draw, and a recommendation. This is the document that goes to the budget holder.
Ariadne delivers all three. The CSV and API feed come from Ariadne Analytics, the weekly read-out is shared as part of the pilot cadence, and the final report is co-authored with the operations and finance contacts on the project. The point is not the deliverable list; the point is that the data is in the team's hands before the pilot ends, so the conclusion is theirs, not the vendor's.
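On the receiving side, a short loader that enforces the column dictionary catches format drift on the first day rather than in week three. The sketch below is illustrative only: the column names (`timestamp_utc`, `zone_id`, `entries`, `occupancy`, `dwell_seconds`) are hypothetical and do not describe any vendor's actual export; substitute whatever the agreed dictionary says.

```python
import csv
import io
from datetime import datetime

# Hypothetical column dictionary; replace with the one agreed for the pilot.
REQUIRED_COLUMNS = {"timestamp_utc", "zone_id", "entries", "occupancy", "dwell_seconds"}


def load_feed(csv_text):
    """Parse an hourly feed and fail loudly if agreed columns are missing."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"feed is missing columns: {sorted(missing)}")

    rows = []
    for raw in reader:
        rows.append({
            # ISO 8601 with an explicit offset, e.g. 2024-03-04T10:00:00+00:00
            "timestamp": datetime.fromisoformat(raw["timestamp_utc"]),
            "zone_id": raw["zone_id"],
            "entries": int(raw["entries"]),
            # Occupancy and dwell are optional per zone; empty cells become None.
            "occupancy": float(raw["occupancy"]) if raw["occupancy"] else None,
            "dwell_seconds": float(raw["dwell_seconds"]) if raw["dwell_seconds"] else None,
        })
    return rows
```

Keeping timestamps in one explicit standard, here UTC with an offset, avoids the hour-shift arguments that appear when the feed is later joined against POS data.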
Set the internal stakeholder cadence
Most pilots that go quiet were not technically broken. The sensor counted, the dashboard worked, the data exported. What went quiet was the conversation inside the customer organisation. Nobody owned the weekly read of the data. The operations manager assumed the analytics team was watching it. The analytics team assumed the operations manager was reading the dashboard. The budget holder waited for a recommendation that never arrived. Four weeks later there was a folder of CSVs and no decision.
A working pilot has a named cadence with four roles.
- The pilot owner. One named person inside the customer organisation. They run the weekly meeting, hold the success criteria, and write the final recommendation. They are usually the operations lead or the site general manager, not the IT contact.
- The data lead. The person who pulls the CSV or runs the BI report. They surface the anomalies and prepare the weekly read-out. In a small organisation this can be the same person as the pilot owner; in a larger one it is usually the analytics team.
- The site lead. The person on the floor who flags real-world context: a delivery blocked the entrance for two hours on Tuesday, the lift was out on Thursday, a film crew used the back hall on Friday. Without that context the data has unexplained spikes and the meeting argues about ghosts.
- The budget holder. The person who will sign the rollout. They do not need to be in the weekly meeting, but they should receive the weekly read-out and the final report. A pilot that surprises the budget holder at the end is a pilot that gets shelved.
Run a thirty-minute weekly meeting for the four weeks. Open with the read-out chart, walk through the week's anomalies with the site lead, check the data against each success criterion, and end with one action: what does the team need to validate, change, or watch for next week. Four meetings of thirty minutes is the entire stakeholder cost of a well-run pilot.
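The data lead can prepare the anomaly walk-through by flagging days that sit far from the locked baseline, so the site lead only has to explain the outliers. This is a sketch under stated assumptions: the 25 per cent tolerance is an arbitrary illustration to be tuned per site, and the function name is hypothetical.

```python
from datetime import date


def flag_for_annotation(daily_totals, baseline_by_weekday, tolerance_pct=25.0):
    """List days whose total deviates from the weekday baseline.

    `daily_totals` maps date -> daily entries.
    `baseline_by_weekday` maps weekday index (Monday = 0) -> baseline entries.
    Returns (date, percentage deviation) pairs for the site lead to annotate.
    """
    flagged = []
    for day, total in sorted(daily_totals.items()):
        baseline = baseline_by_weekday.get(day.weekday())
        if not baseline:
            continue  # no baseline for this weekday, nothing to compare
        deviation = 100.0 * (total - baseline) / baseline
        if abs(deviation) > tolerance_pct:
            flagged.append((day, round(deviation, 1)))
    return flagged
```

A flagged day is a prompt for context, not a verdict: a blocked entrance or a film crew in the back hall explains a drop that the data alone cannot.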
Six pitfalls that quietly turn a pilot into a demo
Most of the pilots that fail to produce a clean answer fail in the same handful of ways. Each one is avoidable if the team names it before the pilot starts.
- Counting too much. A pilot that instruments fifteen zones across three sites at once will produce data nobody has time to read. Pick the smallest scope that can answer the decision.
- No reference for accuracy. Without a manual or POS check in week one, every later disagreement about the count becomes a debate the vendor cannot win. Validate the sensor once, in writing, on day two.
- Vague success criteria. If the pilot ends and the team is arguing about whether the data was "good enough", the criteria were never written down. Write the thresholds before week two.
- Data trapped in a vendor portal. If the analytics team cannot pull the data into their existing tools, the pilot will be evaluated on screenshots from a dashboard nobody owns. Insist on CSV or API access from day one.
- No site context in the read-out. A pilot without weekly anomaly notes argues about ghost spikes. Get the site lead in the cadence meeting and have them annotate the week.
- Privacy as a late surprise. A pilot that gets to week three before someone asks "is this GDPR-friendly?" loses momentum fast. Settle the privacy question before any sensor is mounted, and prefer a method that captures no personal data in the first place so the conversation with the data protection officer is short.
How Ariadne approaches a pilot
The structure above is vendor-agnostic and works against any honest counting system. What follows is specific to running a pilot with Ariadne, because the method shapes a few of the choices.
Ariadne measures footfall, dwell, and visitor paths with Hybrid Fusion, its patented camera-free method. Time-of-Flight depth sensing counts every visitor at the entrances, capturing geometry rather than images, while patented phone signal sensing follows movement through the interior, detecting the signals a phone emits even in airplane mode. The sensor streams both feeds to Ariadne, where Hybrid Fusion combines them into one trajectory per visit and computes counts, dwell, and paths. The streams carry no identifier: no MAC address, no device ID, no biometric data, and no camera is involved. Identifiers are stored only when a visitor explicitly opts in, which keeps the method GDPR-friendly and outside biometric territory.
For a pilot, three properties of that method matter. First, accuracy is set at the entry by Time-of-Flight depth sensing, so the day-two validation against a manual count is a clean test. Second, the system reads dwell and journeys inside the zone without recording who anyone is, so the privacy question can be settled in week zero rather than week three and the conversation with the data protection officer stays short. Third, the data exports cleanly: hourly counts, occupancy, and dwell by zone over CSV or API, so the customer's analytics team can pull the data into the tools they already use and the pilot does not get stuck in a vendor dashboard. The sensor lineup is set out in the Ariadne hardware overview, the method is documented on the how-it-works page, and the data handling is in the privacy policy. The contact form is where to scope a four-week pilot if the structure in this post matches the decision you are trying to settle.
A four-week pilot, at a glance
- Week zero. Scope agreed in writing: the decision, the zones, the comparison, the success criteria, the cadence, the privacy answer. Sensors arrive and are installed at the chosen zones.
- Week one. Baseline week. Sensor validated against a manual or POS reference on day two. Typical-week shape captured. Baseline metrics locked in writing. First weekly read-out at end of week.
- Week two. Comparison period starts: the intervention goes live, or the second comparison site comes online, or the pre-intervention reference period begins, depending on the design. Weekly read-out two.
- Week three. Mid-pilot check against the success criteria. Anomalies annotated by the site lead. Course-corrections agreed if data is unclear. Weekly read-out three.
- Week four. Final week of data collection. Data feed continues. Final pilot report drafted in parallel: scope, baseline, criteria, data against each, recommendation. Weekly read-out four plus end-of-pilot review with the budget holder.
Four weeks, four meetings, a written scope at the start and a written recommendation at the end. That is the shape of a pilot that produces a procurement answer rather than a dashboard tour.
FAQ
Is four weeks really enough to prove ROI?
Four weeks is enough to prove the measurement, prove the data lands where the team works, and prove the data is moving a real decision. It is not enough to capture full-year seasonality, and the report should say so. The honest framing is that the pilot proves the system is worth deploying; the first year of deployment is where the seasonal baseline is built and the ROI is realised. A pilot that promises a complete annual ROI in four weeks is a pilot promising more than the data can carry.
Do the sensors use cameras?
No. Ariadne counts with Hybrid Fusion: Time-of-Flight depth sensing plus patented phone signal sensing, never cameras. Time-of-Flight captures geometry rather than images, and signal sensing captures no MAC address by default, so the measurement involves no video, no faces, and no biometric data.
What does a pilot cost compared with a full rollout?
A pilot is priced as a short engagement with a small number of sensors and the read-out support around it, not as a fraction of the rollout. The point of pricing it that way is to keep the pilot honest: the customer pays for the measurement, the report, and the time, and the rollout pricing is then negotiated against the data the pilot produced rather than bundled in. Specific pricing depends on zones, sensor count, and whether the customer wants the pilot report delivered as a finished document. The contact form is the right place to start that conversation.
Can the pilot integrate with our POS or BI tools?
Yes, and the cleanest way to do it is to agree the integration in week zero rather than week three. Hourly counts and dwell export as CSV or over API, with a documented column dictionary, so the customer's analytics team can pull the feed into the existing tools and join it against POS or transaction data. Pilots that leave the integration question until the end of week two tend to end with the data trapped in a vendor portal, which defeats the point.
What happens to the data and the sensors if we do not roll out?
The data collected during the pilot belongs to the customer; a clean pilot agreement says so up front. The sensors are removed at the end of the pilot if no rollout decision is made, and the customer keeps the final report and the underlying counts. A pilot is a measurement engagement, not a one-way door, and a vendor that treats it otherwise is the wrong vendor.



