Writing a Stateless Translation Service in Go
My last employer builds AI-powered damage assessment software for the automotive insurance industry: insurers and body shops use it to process claims, calculate repair costs, and get damage evaluations done in minutes instead of days. Part of that work is integrating with external repair cost calculation APIs. These are legacy, SOAP-based formats with their own taxonomy for parts, labor times, and tax conventions, and they look nothing like a clean internal domain model.
The first time we integrated with one, the translation logic went directly into the service that needed it. That was fine. The second time, a different service needed the same format. We copied the logic and adapted it. By the third time, we had three slightly different versions of the same conversion code, and no clear owner when something broke.
The fix was to pull all of it into a single dedicated service. One Go microservice that knows the external format. Everything else talks to it. The rest of the platform stays clean.
Why stateless
The service has no database, no queue, and no request state that outlives the request. A request comes in, the service transforms the payload, and it responds.
This made the deployment story simple: no migrations, no coordination between replicas, and no draining beyond letting in-flight requests finish. When a new version is deployed, the old instances complete their in-flight requests and shut down while the new ones start picking up traffic. We never had a deployment incident with this service.
Horizontal scaling is also a non-issue — each instance is identical, so adding more behind a load balancer just works.
SOAP in Go
The external APIs use SOAP — not REST, not gRPC, full XML envelopes with a WSDL spec. Go’s standard library has no SOAP client.
I generated one from the WSDL using gowsdl. The generated code gave me the struct definitions and a basic transport layer. I then layered custom marshaling on top for the quirks the generated code couldn't handle cleanly. WSDL specs describe what a format is supposed to look like; the actual API has undocumented behaviors you only discover when requests start failing in production.
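To give a feel for what layering custom marshaling over generated structs can look like, here is a minimal sketch built on the `xml.Marshaler` interface from the standard library. Everything specific in it is an assumption made up for illustration: the `Amount` type, the `Position` struct, the element names, and the decimal-comma quirk itself. The real quirks of the API are not described in this post.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Amount is a hypothetical money value in cents. Assumed quirk: the
// receiver wants "12,50" (decimal comma) instead of "12.50".
type Amount int64

// MarshalXML implements xml.Marshaler, so every Amount field is written in
// the receiver's format without editing the generated struct definitions.
func (a Amount) MarshalXML(e *xml.Encoder, start xml.StartElement) error {
	sign := ""
	if a < 0 {
		sign, a = "-", -a
	}
	return e.EncodeElement(fmt.Sprintf("%s%d,%02d", sign, a/100, a%100), start)
}

// Position stands in for a struct that a WSDL generator might emit.
type Position struct {
	XMLName xml.Name `xml:"Position"`
	Code    string   `xml:"Code"`
	Net     Amount   `xml:"NetPrice"`
}

// encode marshals v and returns the XML as a string.
func encode(v any) (string, error) {
	b, err := xml.Marshal(v)
	return string(b), err
}

func main() {
	out, err := encode(Position{Code: "0471", Net: 1250})
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(out) // <Position><Code>0471</Code><NetPrice>12,50</NetPrice></Position>
}
```

Putting the quirk on the field type keeps it in one place: any struct that carries an `Amount` picks up the format automatically.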
During the early debugging phase, I built a thin wrapper that logged every raw XML request and response. Expensive to run all the time, so I added a flag to enable it on demand. That log is how I found three field-format discrepancies between the spec and what the API actually expected.
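A wrapper like that can be sketched as an `http.RoundTripper` that dumps traffic with `net/http/httputil` only while a switch is on. This is an illustration under assumptions, not the original code: the `dumpTransport` name, the `atomic.Bool` switch, and the `fakeUpstream` stand-in (used so the demo needs no network) are all hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"strings"
	"sync/atomic"
)

// dumpTransport wraps another RoundTripper and, while enabled is true,
// passes the raw request and response (headers and body) to logf.
type dumpTransport struct {
	next    http.RoundTripper
	enabled *atomic.Bool
	logf    func(format string, args ...any)
}

func (t *dumpTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	if !t.enabled.Load() {
		return t.next.RoundTrip(req) // fast path: nothing is buffered
	}
	// DumpRequestOut reads the body into memory and restores it on req.
	if raw, err := httputil.DumpRequestOut(req, true); err == nil {
		t.logf("soap request:\n%s", raw)
	}
	resp, err := t.next.RoundTrip(req)
	if err != nil {
		return nil, err
	}
	// DumpResponse likewise replaces resp.Body with a re-readable copy.
	if raw, err := httputil.DumpResponse(resp, true); err == nil {
		t.logf("soap response:\n%s", raw)
	}
	return resp, nil
}

// fakeUpstream stands in for the external API (hypothetical, demo only).
type fakeUpstream struct{}

func (fakeUpstream) RoundTrip(*http.Request) (*http.Response, error) {
	rec := httptest.NewRecorder()
	rec.WriteString("<Envelope/>")
	return rec.Result(), nil
}

// countDumps sends one request and reports how many dumps were logged.
func countDumps(on bool) int {
	var enabled atomic.Bool
	enabled.Store(on)
	n := 0
	client := &http.Client{Transport: &dumpTransport{
		next:    fakeUpstream{},
		enabled: &enabled,
		logf:    func(string, ...any) { n++ }, // log.Printf in a real service
	}}
	resp, err := client.Post("http://calc.invalid/soap", "text/xml", strings.NewReader("<Envelope/>"))
	if err != nil {
		return -1
	}
	resp.Body.Close()
	return n
}

func main() {
	fmt.Println(countDumps(true), countDumps(false)) // prints "2 0"
}
```

Because the switch is an atomic value rather than a startup option, it could be flipped at runtime, for example from an admin endpoint, without restarting the service.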
The merge problem
The external format doesn’t allow duplicate position codes. Our internal model does — a vehicle door might appear in both a scratch assessment and a dent assessment. When translating, those entries have to be merged into a single position.
Getting the merge semantics right took a few iterations. Which field takes priority when two entries conflict? How do you combine labor times? What if the tax rates differ? The solution was a two-pass approach:
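    // Pass 1: group entries by external position code.
    grouped := groupByCode(positions)
    // Pass 2: merge each group into a single position. Map iteration order
    // is random in Go, so sort result afterwards if the receiver needs a
    // stable order.
    for _, group := range grouped {
        merged := mergePositions(group)
        result = append(result, merged)
    }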
All the precedence rules live in mergePositions. When the rules changed — and they did, several times as edge cases surfaced — there was one function to update and one set of tests to run.
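For readers who want to see the two passes end to end, here is a self-contained sketch. The `Position` fields and, more importantly, the merge rules are assumptions chosen for illustration (labor minutes add up, the highest tax rate wins, the first entry supplies everything else). The real precedence rules are not spelled out in this post.

```go
package main

import (
	"fmt"
	"sort"
)

// Position is a hypothetical, simplified internal position.
type Position struct {
	Code         string // external position code
	LaborMinutes int
	TaxRate      float64
	Source       string // e.g. "scratch" or "dent"
}

// groupByCode buckets positions by external code, keeping input order
// inside each bucket.
func groupByCode(ps []Position) map[string][]Position {
	g := make(map[string][]Position)
	for _, p := range ps {
		g[p.Code] = append(g[p.Code], p)
	}
	return g
}

// mergePositions collapses one non-empty group into a single position.
// ASSUMED rules, for illustration only: labor minutes add up, the highest
// tax rate wins, and the first entry supplies the remaining fields.
func mergePositions(group []Position) Position {
	m := group[0]
	for _, p := range group[1:] {
		m.LaborMinutes += p.LaborMinutes
		if p.TaxRate > m.TaxRate {
			m.TaxRate = p.TaxRate
		}
	}
	return m
}

// translate runs both passes and sorts by code so the output is stable
// despite Go's randomized map iteration order.
func translate(ps []Position) []Position {
	grouped := groupByCode(ps)
	result := make([]Position, 0, len(grouped))
	for _, group := range grouped {
		result = append(result, mergePositions(group))
	}
	sort.Slice(result, func(i, j int) bool { return result[i].Code < result[j].Code })
	return result
}

func main() {
	out := translate([]Position{
		{Code: "DOOR-FL", LaborMinutes: 90, TaxRate: 0.07, Source: "scratch"},
		{Code: "HOOD", LaborMinutes: 120, TaxRate: 0.19, Source: "dent"},
		{Code: "DOOR-FL", LaborMinutes: 30, TaxRate: 0.19, Source: "dent"},
	})
	for _, p := range out {
		fmt.Printf("%s: %d min, tax %.2f, from %s\n", p.Code, p.LaborMinutes, p.TaxRate, p.Source)
	}
}
```

Keeping `groupByCode` free of business rules means a change in precedence only touches `mergePositions` and its tests.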
Image handling
Some integrations require damage photos attached to requests. Mobile clients send HEIC files, often 10–15MB. Pushing that over a SOAP request to a legacy API is slow and often rejected outright.
The service uses govips (a Go wrapper around libvips) to decode HEIC, resize to the maximum dimensions the receiver accepts, and re-encode as JPEG before base64-encoding it into the SOAP body. libvips is fast enough that image processing adds no noticeable latency to the overall request. Running ImageMagick or pure-Go image processing would have been a different story.
Observability
A middleware chain handles cross-cutting concerns: RequestID → Auth → Billing → Metrics → Handler.
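One common way to build such a chain on top of `net/http` is a small helper that wraps the handler from the inside out. The sketch below assumes that approach. The `Chain` and `step` names are hypothetical, and `step` is only a placeholder that records its name where a real middleware would do its work.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Middleware wraps a handler with extra behavior.
type Middleware func(http.Handler) http.Handler

// Chain wraps h so that mws[0] is outermost and therefore runs first.
func Chain(h http.Handler, mws ...Middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// step is a stand-in for a real middleware: it records its name in trace
// and then calls the next handler.
func step(name string, trace *[]string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*trace = append(*trace, name)
			next.ServeHTTP(w, r)
		})
	}
}

// order sends one request through the chain and returns the names in the
// order they executed.
func order() []string {
	var trace []string
	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		trace = append(trace, "Handler")
	})
	h := Chain(handler,
		step("RequestID", &trace),
		step("Auth", &trace),
		step("Billing", &trace),
		step("Metrics", &trace),
	)
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodPost, "/translate", nil))
	return trace
}

func main() {
	fmt.Println(order()) // [RequestID Auth Billing Metrics Handler]
}
```

Order matters here: a middleware that returns early, such as a failed auth check, stops everything to its right from running.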
The metrics middleware records latency and error rates per external integration. When one of the APIs silently changed a field format on their side, I saw a spike in error rates for that specific connector before any user reported a problem — timestamped, scoped to the right integration, actionable without digging through logs.
The billing middleware was worth thinking through carefully. Each company has a usage ledger in the main platform. The naive approach — a synchronous call to the billing API on every translation request — adds latency and creates a hard dependency on billing availability. Instead, the middleware batches usage events in memory and flushes them asynchronously. Billing stays accurate; the translation service doesn’t break if billing is briefly unavailable.
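The batching idea can be sketched as a small buffer with a periodic flush. This is an illustration under assumptions rather than the production code: the `Batcher` and `UsageEvent` types, the retry-by-requeue behavior, and the `send` callback are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// UsageEvent is a hypothetical billing record.
type UsageEvent struct {
	CompanyID string
	Units     int
}

// Batcher buffers events in memory and hands them to send in batches.
type Batcher struct {
	mu   sync.Mutex
	buf  []UsageEvent
	send func([]UsageEvent) error // e.g. one call to the billing API
}

// Record is what the middleware calls per request. It only appends to the
// buffer, so it never waits on the billing API.
func (b *Batcher) Record(e UsageEvent) {
	b.mu.Lock()
	b.buf = append(b.buf, e)
	b.mu.Unlock()
}

// Flush sends everything buffered so far. If send fails, the events are
// put back at the front of the buffer so the next flush retries them.
func (b *Batcher) Flush() error {
	b.mu.Lock()
	batch := b.buf
	b.buf = nil
	b.mu.Unlock()
	if len(batch) == 0 {
		return nil
	}
	if err := b.send(batch); err != nil {
		b.mu.Lock()
		b.buf = append(batch, b.buf...)
		b.mu.Unlock()
		return err
	}
	return nil
}

// Run flushes every interval until stop is closed, then flushes once more.
func (b *Batcher) Run(interval time.Duration, stop <-chan struct{}) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-t.C:
			_ = b.Flush() // a real service would log or count the error
		case <-stop:
			_ = b.Flush()
			return
		}
	}
}

// simulateOutage shows billing being down for the first flush only.
func simulateOutage() (delivered int, firstErr error) {
	down := true
	b := &Batcher{send: func(batch []UsageEvent) error {
		if down {
			return errors.New("billing unavailable")
		}
		delivered += len(batch)
		return nil
	}}
	b.Record(UsageEvent{CompanyID: "acme", Units: 1})
	b.Record(UsageEvent{CompanyID: "acme", Units: 1})
	firstErr = b.Flush() // fails; both events stay buffered
	down = false
	b.Record(UsageEvent{CompanyID: "globex", Units: 1})
	_ = b.Flush() // delivers all three
	return delivered, firstErr
}

func main() {
	n, err := simulateOutage()
	fmt.Println(n, err) // prints "3 billing unavailable"
}
```

One trade-off of any in-memory buffer is that events not yet flushed are lost if the process dies, so the final flush on shutdown matters, and an unbounded outage would need a cap on the buffer.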
What isolation actually buys
When the external API changed its authentication scheme, the change was one pull request in this service. The main platform didn’t redeploy. The job workers didn’t change. Nothing that calls the service changed.
When we added a second external calculation provider with a completely different format, it was a new adapter registered in this service — not a new integration scattered across four codebases.
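The "new adapter registered in this service" pattern might look roughly like the sketch below. The `Adapter` interface, the `Registry` type, the provider names, and the output formats are all hypothetical, chosen only to show the shape of the design.

```go
package main

import "fmt"

// Estimate is a hypothetical stand-in for the internal domain model.
type Estimate struct {
	ClaimID string
}

// Adapter converts the internal model into one provider's wire format.
type Adapter interface {
	Translate(e Estimate) ([]byte, error)
}

// AdapterFunc lets a plain function act as an Adapter.
type AdapterFunc func(Estimate) ([]byte, error)

func (f AdapterFunc) Translate(e Estimate) ([]byte, error) { return f(e) }

// Registry maps provider names to adapters. It is filled at startup and
// only read afterwards, so this sketch does not lock it.
type Registry struct {
	adapters map[string]Adapter
}

func NewRegistry() *Registry {
	return &Registry{adapters: make(map[string]Adapter)}
}

// Register adds or replaces the adapter for a provider name.
func (r *Registry) Register(name string, a Adapter) {
	r.adapters[name] = a
}

// Translate looks up the provider's adapter and delegates to it.
func (r *Registry) Translate(provider string, e Estimate) ([]byte, error) {
	a, ok := r.adapters[provider]
	if !ok {
		return nil, fmt.Errorf("unknown provider %q", provider)
	}
	return a.Translate(e)
}

func main() {
	reg := NewRegistry()
	// Two made-up providers with deliberately different formats.
	reg.Register("provider-a", AdapterFunc(func(e Estimate) ([]byte, error) {
		return []byte("<Claim id=\"" + e.ClaimID + "\"/>"), nil
	}))
	reg.Register("provider-b", AdapterFunc(func(e Estimate) ([]byte, error) {
		return []byte("claim=" + e.ClaimID), nil
	}))

	for _, p := range []string{"provider-a", "provider-b", "provider-c"} {
		out, err := reg.Translate(p, Estimate{ClaimID: "C-1"})
		fmt.Printf("%s: %s %v\n", p, out, err)
	}
}
```

Callers only ever name a provider and pass the internal model, so adding a provider is one new `Register` call plus the adapter itself.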
That’s the real value here. Not Go specifically, not the SOAP handling — the fact that when things change upstream, one team touches one service and the rest of the platform doesn’t notice.