At the start of a steer-by-wire program, the bottleneck is rarely the bench test itself. The actuator runs its characterisation sweeps, the instruments record, and by the end of the day there is a pile of data sitting on a drive. Then the real work starts. Five separate instruments watched the same runs, each writing in its own format, its own units, and its own clock, and none of it lines up. Before a single result can be trusted, all of it has to be made to agree. That step almost never happens in one sitting. It gets picked up between other tasks, stretched across a busy week, and it is the first thing to be rushed when the schedule is tight.
This is a walkthrough of that first step on a program shaped like a real one: one actuator, 27 characterisation runs, five instruments, 135 raw files. The data landed on the drive in the evening. By the next morning it was one clean dataset, with the first spec checks already completed for every run.
First, the time, since it is the question everyone asks. Done by hand, getting five sources into one analysable dataset is a few days of careful, easy-to-get-wrong effort, and in practice it spreads across most of a week. Here it ran on its own overnight, on all 27 runs at once, and the campaign came back clean apart from two findings, both isolated and ready for an engineer to judge. This is the data tax the series is about (the finding, cleaning, and aligning that has to happen before any analysis can begin), and it is heaviest right here at kickoff, when there is the most data and the least agreement about it.
The data, before it can be read
Each of the five sources is correct on its own. They simply do not describe the test the same way. The rig DAQ writes MF4 at 1 kHz in clean engineering units. The steering robot logs its command angle in radians and its rack position in millimetres. The torque and angle sensors export raw counts. The two redundant ECUs (there for the safety architecture) each write their own CSV with their own channel names, and one of them reports motor current where the spec is written in torque. On top of all that, the clocks disagree by a few seconds from one source to the next.
Turning those five views into one dataset (shared channel names, one set of units, one time base) is the cost that gets paid before any engineering question is even asked.
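To make that concrete, here is a minimal sketch of what a canonical channel map can look like. Everything specific in it is an assumption for illustration: the source and channel names, the counts-per-newton-metre scale, and the motor torque constant are placeholders, not values from this program.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class ChannelRule:
    canonical: str                     # channel name used in the merged dataset
    unit: str                          # unit of the canonical channel
    convert: Callable[[float], float]  # raw sample -> canonical unit


# Hypothetical calibration constants, for illustration only.
KT_NM_PER_A = 0.05      # assumed motor torque constant (N·m per ampere)
COUNTS_PER_NM = 1000.0  # assumed torque sensor scale (counts per N·m)

# (source, raw channel name) -> rule. All names here are made up.
RULES: Dict[Tuple[str, str], ChannelRule] = {
    ("robot", "cmd_angle_rad"): ChannelRule("steer_angle_cmd", "deg", math.degrees),
    ("torque_sensor", "raw_counts"): ChannelRule(
        "column_torque", "N·m", lambda counts: counts / COUNTS_PER_NM
    ),
    ("ecu_b", "MotCur"): ChannelRule(
        "motor_torque", "N·m", lambda amps: amps * KT_NM_PER_A
    ),
}


def to_canonical(
    source: str, channel: str, samples: Sequence[float]
) -> Tuple[str, str, List[float]]:
    """Rename one raw channel and convert its samples to the canonical unit."""
    rule = RULES[(source, channel)]
    return rule.canonical, rule.unit, [rule.convert(x) for x in samples]
```

With the rules held in one table, a lookup failure (a `KeyError` here) flags an unmapped channel immediately instead of letting it slip into the dataset under the wrong name or unit.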
What ran overnight
None of the steps here are new. Every validation engineer knows how to read an MF4 file, convert units, or line up two clocks. What changes is that all of them run, in order, on every run, without a person carrying files from one tool to the next.
The pipeline ingested the 135 files and mapped them to one canonical set of channels. It then estimated each source's clock offset, resampled everything onto a shared 100 Hz grid, ran sensor QC across all 17 channels, found the stabilised hold points, and computed the six spec metrics on every run.
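The clock step is the one most easily got wrong by hand, so a sketch may help. The article does not say how the offsets were estimated; the version below assumes one common approach, cross-correlating a signal that two sources both observed, followed by linear interpolation onto a uniform grid. The function names are hypothetical, and the brute-force search is written for clarity rather than speed.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple


def estimate_lag(ref: Sequence[float], sig: Sequence[float], max_lag: int) -> int:
    """Return the lag k (in samples) that best aligns sig[i + k] with ref[i].

    Both inputs are assumed to be sampled at the same rate. A positive
    result means events appear k samples later in sig than in ref.
    """
    ref_mean = sum(ref) / len(ref)
    sig_mean = sum(sig) / len(sig)
    r = [x - ref_mean for x in ref]
    s = [x - sig_mean for x in sig]
    best_lag, best_score = 0, float("-inf")
    for k in range(-max_lag, max_lag + 1):
        # Cross-correlation at lag k, over the overlapping samples only.
        score = sum(r[i] * s[i + k] for i in range(len(r)) if 0 <= i + k < len(s))
        if score > best_score:
            best_lag, best_score = k, score
    return best_lag


def resample_linear(
    t: Sequence[float],
    y: Sequence[float],
    t_start: float,
    t_end: float,
    rate_hz: float = 100.0,
) -> Tuple[List[float], List[float]]:
    """Linearly interpolate (t, y) onto a uniform grid; t must be ascending."""
    n = int(round((t_end - t_start) * rate_hz)) + 1
    grid = [t_start + i / rate_hz for i in range(n)]
    out = []
    for tg in grid:
        j = bisect_right(t, tg)
        if j == 0:
            out.append(y[0])    # before the first sample: hold the first value
        elif j == len(t):
            out.append(y[-1])   # at or after the last sample: hold the last value
        else:
            w = (tg - t[j - 1]) / (t[j] - t[j - 1])
            out.append(y[j - 1] + w * (y[j] - y[j - 1]))
    return grid, out
```

In use, the lag divided by the sample rate gives the source's offset in seconds. Subtracting that from the source's timestamps before calling `resample_linear` puts it on the reference clock.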
By the time anyone arrived, the cleaned dataset, the QC report, and the first spec results were already on the shared drive. Nothing waited on someone being free to run it.
Where the actuator stands
With the data finally speaking one language, the first real question can be asked: how does the actuator measure up? Every run is checked against all six metrics at once, so the whole campaign reads at a glance.
The headline is reassuring. Four of the six metrics pass on every applicable run; among them, steering ratio holds at 15.0, and motor torque and current stay well inside their hardware limits. The remaining two metrics account for all five failures, and both clusters are tidy enough to hand straight to an engineer.
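The tally behind a summary like this is simple to sketch. The two upper limits below are the ones quoted in this walkthrough (5.0 N·m for feedback torque at full lock, 0.4 N·m for on-center hysteresis). The metric keys, the function names, and the use of `None` to mark a metric that does not apply to a run are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, List, Optional

# Upper limits in N·m, as quoted in this article; the keys are hypothetical.
UPPER_LIMITS_NM: Dict[str, float] = {
    "feedback_torque_full_lock": 5.0,
    "on_center_hysteresis": 0.4,
}


def failed_metrics(metrics: Dict[str, Optional[float]]) -> List[str]:
    """Return the metrics a single run fails; None means 'not applicable'."""
    failures = []
    for name, limit in UPPER_LIMITS_NM.items():
        value = metrics.get(name)
        if value is not None and value > limit:
            failures.append(name)
    return failures


def tally_failures(runs: Dict[str, Dict[str, Optional[float]]]) -> Counter:
    """Count failing runs per metric across the whole campaign."""
    counts: Counter = Counter()
    for metrics in runs.values():
        counts.update(failed_metrics(metrics))
    return counts
```

Grouping by metric first, rather than by run, is what makes clusters visible: a handful of failures concentrated in one or two metrics reads very differently from the same number scattered across all six.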
Two findings to hand back
The first is feedback torque at full lock. Three runs fail, and all three are at 80 °C, measuring about 5.38 N·m against a 5.0 N·m upper limit. Every other temperature sits comfortably inside the band, so this is a temperature trend rather than random scatter. The measured torque is also about 20% higher than the feel-curve model predicts, which points straight at the model's temperature coefficient as the thing to review.
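The two percentages in that finding measure different things, and a few lines of arithmetic keep them apart. The sketch assumes the 20% figure is relative to the model's prediction; the article does not state the predicted value, so the number derived here is only what that assumption implies.

```python
def percent_over(measured: float, reference: float) -> float:
    """Percentage by which measured exceeds reference."""
    return 100.0 * (measured - reference) / reference


measured_nm = 5.38  # feedback torque at full lock, 80 °C (from the article)
limit_nm = 5.0      # upper spec limit (from the article)

# Margin over the spec limit: roughly 7.6%.
over_limit_pct = percent_over(measured_nm, limit_nm)

# Assumption: "about 20%" is measured relative to the model prediction,
# which would put the prediction near 4.48 N·m.
implied_model_nm = measured_nm / 1.20
```

Under that assumption the model would have predicted a value inside the limit at 80 °C, which is consistent with reviewing the temperature coefficient before anything else.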
The second is on-center hysteresis. Only two runs exceed the 0.4 N·m limit, both at the same condition (25 °C and 13.5 V), and both carry operator notes (one was a rerun of an earlier attempt, the other had a clunk reported on its first cycle). The other runs at that exact condition pass cleanly, so this reads as run-specific, most likely a fixture or settling issue rather than a property of the actuator.
Why this matters at kickoff
The five instruments will always arrive in five formats; the bench does not care about your tooling. What changes is who pays to reconcile them, and when. Here the cost is paid once, by the system, on the first night, instead of by an engineer across the first week. And because it ran in the shared data layer, the cleaned dataset, the checks, and the method that produced them stay where the next engineer can find them, rather than living in one person's scripts on a drive nobody else can open.
The agent does not sign anything off. It assembled the evidence and ran every check the same way on every run, then handed back two clear findings and a campaign that is otherwise clean. That is the point of paying the data tax up front: the first week of the program goes to reasoning about the hardware, not to teaching five instruments to speak the same language.
Next in the series: the fault-injection campaign, where the number of runs (not the formats) becomes the problem. If the week that disappears before the first plot sounds familiar, we would like to talk: founders@movedot.com, or www.movedot.ai.
