COMMAND · VESSEL MANAGEMENT SOFTWARE
Trained reviewers had processed these documents for years. The AI system surfaced critical information that had previously gone undetected.
A five-month prototype proved the capability, established the accuracy benchmark, and produced a finding that changed how the organisation thinks about AI in its workflow. The tool now runs in production as an internal fact-checker.
A maritime software platform onboarding growing vessel volumes across multiple regions. At each onboarding: OEM manuals in a different format for every manufacturer, with no consistent layout between them. All extraction done by hand.
Southern Sky AI built a production-grade AI extraction system in four weeks and tested it for five months. The system achieved 85% extraction precision on unstructured documents that share no standard format. In doing so, it produced an unexpected finding: the AI consistently surfaced critical information inside those documents that experienced human reviewers had not found.
85% precision. A defined benchmark. A tool in active production use as an internal fact-checker. And a discovery about what AI finds inside complex documents that skilled human review does not.
The Organisation
A maritime software company serving vessel operators globally. The platform manages maintenance, compliance, and operational records under ISM Code and Safety Management System obligations, and is used by captains, engineers, and shore-based management teams across international fleets. Core onboarding workflow: extract vessel data from source documents, structure it within the platform. As vessel volumes grew, this step became the bottleneck.
The Situation
OEM manuals. No standard structure. Each formatted to a different manufacturer's conventions, with data distributed across inconsistent layouts, tables, and formats. Skilled operators locating and transferring data by hand, document by document, component by component. Time cost compounding with every vessel added. The question brought to Southern Sky AI: could AI take on that extraction work reliably, and at what accuracy level?
The Work
Production-grade AI extraction system built in four weeks. Advanced AI reasoning model selected for its ability to parse unstructured formats and apply consistent structure to inconsistent inputs regardless of manufacturer layout. Parallel processing for multiple components simultaneously. Clean re-run logic so failed components could be retried without reprocessing completed work. Three operator touchpoints to minimise training requirements and reduce failure modes.
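The engagement write-up does not publish implementation details, so the parallel processing and clean re-run behaviour it describes can only be sketched. A minimal illustration, assuming a hypothetical extract_component model call and a JSON checkpoint file (both illustrative, not from the actual system): each component is processed in parallel, results are persisted as they complete, and a re-run skips anything already done.

```python
import json
import os
from concurrent.futures import ThreadPoolExecutor, as_completed

CHECKPOINT = "extraction_checkpoint.json"  # hypothetical checkpoint file

def load_completed() -> dict:
    """Results for components already extracted on an earlier run."""
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            return json.load(f)
    return {}

def save_completed(done: dict) -> None:
    with open(CHECKPOINT, "w") as f:
        json.dump(done, f, indent=2)

def extract_component(component_id: str, manual_text: str) -> dict:
    """Stand-in for the model call that structures one component's data."""
    return {"component": component_id, "fields": {}}  # illustrative stub

def run_extraction(components: dict[str, str], workers: int = 8) -> dict:
    done = load_completed()  # clean re-run: completed work is never reprocessed
    pending = {cid: text for cid, text in components.items() if cid not in done}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # parallel processing: multiple components extracted simultaneously
        futures = {pool.submit(extract_component, cid, text): cid
                   for cid, text in pending.items()}
        for fut in as_completed(futures):
            cid = futures[fut]
            try:
                done[cid] = fut.result()
                save_completed(done)  # persist immediately so a crash costs nothing
            except Exception as exc:
                print(f"{cid} failed: {exc}")  # absent from checkpoint; retried next run
    return done
```

Persisting each result as it completes is what makes the re-run clean: a failed component is simply missing from the checkpoint and gets retried, while finished components are never reprocessed.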
Five months of prototype testing against representative materials. The acceptance standard was held throughout rather than adjusted to match early results. That discipline produced the finding that made the engagement valuable beyond its original scope.
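The write-up reports the headline number without defining the scoring rule. The usual reading of extraction precision is the share of extracted values confirmed correct against reviewer-verified ground truth; a minimal sketch under that assumption (the helper and the field names are illustrative, not from the engagement):

```python
def extraction_precision(extracted: dict[str, str], verified: dict[str, str]) -> float:
    """Of everything the system extracted, the fraction confirmed correct."""
    if not extracted:
        return 0.0
    correct = sum(1 for field, value in extracted.items()
                  if verified.get(field) == value)
    return correct / len(extracted)

# Illustrative only: 17 of 20 extracted fields match the reviewed values -> 0.85
extracted = {f"field_{i}": "ok" for i in range(17)} | {f"field_{i}": "wrong" for i in range(17, 20)}
verified = {f"field_{i}": "ok" for i in range(20)}
assert extraction_precision(extracted, verified) == 0.85
```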
The Outcome
85% extraction precision on documents with no standard structure. In the process, the system consistently surfaced critical information inside the source documents that experienced human reviewers had not found. Data that was always there. Information no reviewer had reached. For a platform managing operational records under safety and compliance obligations, that discovery carried immediate implications.
The system runs in production as an internal fact-checking tool. The organisation holds a defined accuracy benchmark, a documented proof of capability, and a concrete picture of what AI can do inside these documents that trained human review cannot.
The Standard Applied
The acceptance standard was built into the engagement before a single line of the system was written. Holding that standard through five months of testing, rather than adjusting expectations to match early results, is what produced the finding. A defined benchmark, honestly measured, is how an organisation learns what the technology can actually do and what it needs to do next.

