The City of Los Angeles is sitting on an estimated 14 million scanned document images stored across at least three incompatible platforms — and a significant share of those files are duplicates, some appearing as many as six times under different file names, according to city IT staff who briefed the Department of Building and Safety in a March 2026 internal session. The problem did not appear overnight. It is the product of roughly 30 years of piecemeal digitization decisions, each made in isolation, that are now colliding with the most consequential infrastructure deadlines the city has faced in a generation.
The timing could hardly be worse. Mayor Karen Bass declared a housing emergency in January 2023, triggering an accelerated permitting pipeline that relies on fast document retrieval. Meanwhile, the city's 2028 Olympic and Paralympic Games infrastructure program — centered on venues from SoFi Stadium in Inglewood to the Los Angeles Memorial Coliseum in Exposition Park — requires contractors to pull historical site permits, environmental assessments, and easement records at speed. Duplicate images clog those searches, returning redundant results and forcing clerks to manually verify which version of a document is the authoritative one.
Three Systems, No Single Source of Truth
The root of the problem traces back to 1994, when the city first began scanning paper records at the Kenneth Hahn Hall of Administration on Grand Avenue. That early system, built on technology common to that era, stored TIFF images in a proprietary format. When the Bureau of Engineering launched its own digitization effort in 2003, it chose a different vendor and a different file-naming convention. The Department of City Planning added a third layer in 2011 when it migrated to a web-based portal called ZIMAS — the Zoning Information and Map Access System — which imported portions of the earlier archives without deduplication protocols in place.
Each migration pulled records forward but left the old copies intact. Staff at the Planning Department's Figueroa Street offices have described the problem in public commission meetings as a workflow drag, particularly for projects in high-demand corridors like Vermont Avenue in East Hollywood and the Crenshaw District, where fast-track Accessory Dwelling Unit (ADU) applications under the Bass administration require staff to verify lot histories going back to the 1950s. Without reliable deduplication, a single ADU application can generate a document queue that pulls the same 1962 variance approval four or five times.
The Los Angeles City Controller's office flagged storage redundancy as a budget concern in its fiscal year 2024-2025 audit, noting that the city was paying for cloud storage capacity inflated by duplicate assets across multiple departments. The audit did not assign a precise dollar figure to the duplication problem specifically, but placed total unoptimized cloud storage expenditure across city departments above $11 million annually as of June 2025.
What the City Is Doing — and What Still Has to Happen
The ITA — the city's Information Technology Agency, headquartered on Main Street downtown — began a formal Duplicate Image Replacement initiative in late 2025, contracting with a records management firm to run hash-matching algorithms against the Building and Safety archive as a pilot. The goal is to identify canonical versions of each document, tag duplicates, and retire redundant files to cold storage rather than delete them outright, preserving legal chain of custody. The pilot covered roughly 800,000 records in the first phase, completed in April 2026.
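The hash-matching approach described above can be illustrated with a minimal sketch. It is not the contractor's actual method, which the city has not detailed. It assumes duplicates are byte-identical copies, uses SHA-256 as the hash, and applies an assumed rule that the earliest scan is the canonical version. The `ScannedImage` record, its fields, and the file names are hypothetical.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ScannedImage:
    name: str          # hypothetical file name
    scanned_year: int  # hypothetical metadata field
    content: bytes     # raw image bytes


def plan_deduplication(images):
    """Group byte-identical images and pick one canonical copy per group."""
    groups = defaultdict(list)
    for img in images:
        digest = hashlib.sha256(img.content).hexdigest()
        groups[digest].append(img)

    plan = {}
    for digest, members in groups.items():
        # Assumed rule: the earliest scan is canonical; ties are broken by name.
        ordered = sorted(members, key=lambda m: (m.scanned_year, m.name))
        plan[digest] = {
            "canonical": ordered[0].name,
            # Duplicates are tagged for cold storage, not deleted.
            "to_cold_storage": [m.name for m in ordered[1:]],
        }
    return plan


# The same document stored under three names, plus one unrelated record.
archive = [
    ScannedImage("variance_1962.tif", 1994, b"variance-approval-bytes"),
    ScannedImage("BOE_000123.tif", 2003, b"variance-approval-bytes"),
    ScannedImage("zimas_import_77.tif", 2011, b"variance-approval-bytes"),
    ScannedImage("easement_1958.tif", 1994, b"easement-bytes"),
]

for entry in plan_deduplication(archive).values():
    print(entry["canonical"], "->", entry["to_cold_storage"])
```

Exact hashing of this kind only catches files that are identical byte for byte. A document that was re-scanned or re-encoded during a migration would produce a different hash and would need a fuzzier comparison to be matched.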
The remaining 13-plus million documents across Planning, Engineering, and the City Clerk's archive represent the harder lift. ITA has requested a budget allocation in the fiscal year 2026-2027 cycle to expand the program, but as of July 4, that line item had not cleared the Budget and Finance Committee.
For residents and contractors working under the Bass housing emergency or the Olympic timeline, the practical advice from city records staff is specific: when pulling permit histories through the ZIMAS portal or the ePlanLA system, flag duplicate returns using the online feedback tool, which feeds directly into the ITA pilot's training data. Every duplicate flagged through that channel, staff say, improves the model's accuracy and could shorten the permit queue for the next applicant.