Hawaii-as-code: why I put a state's worth of parcels in Git instead of a database
John C. Thomas
Founder, BlueWave Projects
Most production geospatial systems live in a database. PostGIS on Postgres, a normalized schema, a tile server in front of it, an API for the apps, a cache for the reads, and an ETL job that keeps everything in sync with the upstream sources.
That's a reasonable architecture for a typical city or county dataset that updates daily and grows continuously. It is also a *lot of plumbing* for a problem that, in our case, doesn't actually call for it.
Hawaii-as-code is the alternative I built: every TMK parcel, every building footprint, and every address in Hawaiʻi is encoded as TypeScript modules and committed to a single Git repo. 384,262 statewide parcels. 239,458 Honolulu building footprints with real heights. 204,775 address points. Four islands. All of it sits in one repository under portofcams/hawaii-as-code. The 3D map at maps.ikenagroup.com is the static-export rendering of the same source files.
This post is about why that architecture is the right call for *this* shape of dataset, and the specific tradeoffs that come with it.
The problem
Hawaii's authoritative geospatial data is scattered across half a dozen sources.
Every Hawaii-aware product the studio ships (Property Brief, Aloha Network, AI scope generator, ProBuildCalc, Hawaii Property Lookup) needs subsets of this data, fresh, fast, and consistent across products. The conventional architecture would mean: ingest from six sources nightly, normalize into a single schema, serve through one API, cache aggressively, and version the schema as the upstream sources drift.
That's roughly $500/mo of infra and ~2,000 lines of glue code per product that imports it.
The alternative: commit the data
The dataset is finite. The whole compressed corpus — every parcel, building, and address in Hawaiʻi — is well under a gigabyte. It changes on a weekly cadence, not minutely. There are no joins that require SQL-level optimization for this scale.
So I treated it like source code, not like data.
One file per island per layer. Directory tree:

```
parcels/
  oahu.ts
  maui.ts
  hawaii.ts
  kauai.ts
buildings/
  honolulu.ts
  hilo.ts
addresses/
  oahu.ts
  ...
```

Each file exports a typed array of records. Every record has the same TypeScript shape regardless of which county it came from — normalization happens at ingest time, not at read time.
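Concretely, a per-island module might look like the following sketch. The field names here (`tmk`, `geometry`, `areaSqFt`) are illustrative stand-ins, not the repo's actual schema:

```typescript
// Hypothetical sketch of the shared record shape. Every county's data
// normalizes to this one interface at ingest time.
export interface Parcel {
  tmk: string;                       // Tax Map Key, the statewide parcel ID
  island: "oahu" | "maui" | "hawaii" | "kauai";
  geometry: [number, number][];      // lon/lat polygon ring
  areaSqFt: number;
}

// parcels/oahu.ts then exports a typed array like this:
export const oahuParcels: Parcel[] = [
  {
    tmk: "1-4-2-003:001",
    island: "oahu",
    geometry: [[-157.86, 21.30], [-157.85, 21.30], [-157.85, 21.31]],
    areaSqFt: 12000,
  },
];
```

Because every consumer imports the same interface, a county-specific quirk that slips past ingest shows up as a type error, not a runtime surprise.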
One scraper per source. Every Saturday, a cron job hits all six upstream endpoints in parallel, normalizes the responses, and writes the result back to the repo as fresh TypeScript modules. The scraper commits the change with a structured message: `refresh: oahu parcels (+47 / -12 / ~204)`.
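The write-back step of that refresh can be sketched as a small serializer: normalized records in, TypeScript module text out. The record shape and function names below are illustrative assumptions, not the repo's actual code:

```typescript
// Hypothetical sketch of the Saturday refresh's write-back step:
// turn normalized records into a TypeScript module ready to commit.
interface ParcelRecord {
  tmk: string;
  areaSqFt: number;
}

function toModule(exportName: string, records: ParcelRecord[]): string {
  const body = JSON.stringify(records, null, 2);
  return (
    `// generated by the weekly refresh; do not edit by hand\n` +
    `export const ${exportName} = ${body} as const;\n`
  );
}

// The cron job would write this string to parcels/oahu.ts, then run:
//   git add parcels/oahu.ts
//   git commit -m "refresh: oahu parcels (+47 / -12 / ~204)"
const moduleText = toModule("oahuParcels", [
  { tmk: "1-4-2-003:001", areaSqFt: 12000 },
]);
```

Emitting pretty-printed JSON (one field per line) is what makes the next superpower possible: weekly diffs stay small and readable.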
Diffs are the audit trail. Because the data is text, every weekly refresh shows up as a Git diff. Forty new parcels were added in Kailua? You can see them in a PR. A building height changed from 28' to 32'? Visible in the diff. No ETL black box where "the data refreshed, no idea what changed."
Three superpowers that flow from this
1. The same source feeds every product without API drift.
Property Brief, the scope generator, ProBuildCalc, and the lookup tool all do `import { oahuParcels } from "hawaii-as-code/parcels/oahu"`. There is no API between products. No versioning headaches when one product needs a field that another doesn't. No "tile server is down, app is broken" outages. The data layer is a compile-time dependency. If it compiles, it works.
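A product-side lookup is then just array code over the import. This sketch inlines a tiny stand-in array so it is self-contained; in the real repo the array would come from the `hawaii-as-code/parcels/oahu` import, and the field names are assumptions:

```typescript
// Hypothetical consumer sketch. In a real product this array would be:
//   import { oahuParcels } from "hawaii-as-code/parcels/oahu";
interface Parcel {
  tmk: string;
  areaSqFt: number;
}

const oahuParcels: Parcel[] = [
  { tmk: "1-4-2-003:001", areaSqFt: 12000 },
  { tmk: "1-4-2-003:002", areaSqFt: 8400 },
];

// The data layer is a compile-time dependency: if this compiles,
// the lookup works, with no API or tile server in between.
function findByTmk(tmk: string): Parcel | undefined {
  return oahuParcels.find((p) => p.tmk === tmk);
}
```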
2. Static export becomes the 3D map.
The 3D map at maps.ikenagroup.com is *just* the TypeScript compiled into a Three.js scene at build time. Buildings are extruded from the footprint arrays to their real heights. Parcels are rendered as clickable polygons. There's no map server, no runtime API for the map itself. Cloudflare Pages serves the whole experience as static assets. Page weight: about 18 MB across the four islands; load time on a good connection: ~2 seconds for the first paint with progressive enhancement.
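The core of that build-time extrusion, stripped of the Three.js dependency, is just lifting each footprint ring to its recorded height. This is a conceptual sketch with assumed field names, not the map's actual build code:

```typescript
// Hypothetical sketch of the build-time extrusion step: each building's
// footprint ring is lifted to its real height, producing the roof
// vertices a Three.js BufferGeometry would be built from.
type Point2 = [number, number];
type Point3 = [number, number, number];

interface Building {
  footprint: Point2[]; // ground-plane ring, projected meters
  heightFt: number;    // real recorded height, in feet
}

function roofVertices(b: Building): Point3[] {
  const heightM = b.heightFt * 0.3048; // feet to meters
  // y-up convention: ground x/z from the footprint, y from the height
  return b.footprint.map(([x, z]) => [x, heightM, z]);
}
```

Running this over all 239,458 Honolulu footprints at build time is what lets Cloudflare Pages serve the finished scene as static assets.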
3. The codebase is the data ops team.
When a new field needs to be added — say, "is this parcel in a special management area?" — I edit the scraper, push, and the next Saturday refresh fills the field in. The whole "data engineering" surface is the same code surface as the application. There's no separate team, no separate vocabulary, no separate deploy pipeline. Claude Code edits the scrapers the same way it edits the React components.
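The "edit the scraper, push" workflow for that example might look like the following sketch. The field name `inSpecialManagementArea` comes from the example above; the upstream column name `SMA_FLAG` and the `normalize` helper are illustrative assumptions:

```typescript
// Hypothetical sketch of adding a field: extend the record type and the
// scraper's normalize step; the next Saturday refresh fills it in.
interface Parcel {
  tmk: string;
  inSpecialManagementArea: boolean; // the new field
}

// Scraper-side normalization from a raw upstream row.
function normalize(raw: { TMK: string; SMA_FLAG?: string }): Parcel {
  return {
    tmk: raw.TMK,
    inSpecialManagementArea: raw.SMA_FLAG === "Y",
  };
}
```

The type change and the ingest change land in one diff, which is the whole point: there is no second schema to migrate anywhere else.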
The honest tradeoffs
This architecture is *not* right for every dataset. The specific shape that makes it work here:
Things that *don't* work:
When to steal this pattern
Specifically when:
For Hawaii — finite, slow-changing, multi-source, multi-product — this architecture is the right call. For the contiguous US it isn't. For a real-time logistics feed it isn't. For PII-bearing customer data it absolutely isn't.
But if your dataset *does* fit, treating geospatial data like source code instead of like database rows is one of the cleanest architectural moves I've made in 12+ shipped products. The whole "data ops" team is a Saturday cron job and a PR review.