Los Angeles city departments processed more than 14 million digital document images last fiscal year, and a growing share of those files are exact or near-exact duplicates sitting in overlapping databases across at least a dozen separate agencies. The Bureau of Engineering, the Department of Building and Safety, and the Los Angeles Housing Department have each flagged the problem internally, and a coordinated deduplication push — formally part of the city's broader GovTech modernization drive — began in earnest in the first quarter of 2026.
The timing is not accidental. With the 2028 Summer Olympics less than two years out, city officials are under mounting pressure to streamline digital permitting and accelerate construction sign-offs at venues stretching from Exposition Park to the Rose Bowl in Pasadena. They also need contractors to be able to pull accurate, single-source records without wading through redundant scanned imagery that inflates file retrieval times and generates conflicting version histories. A duplicate image in a structural permit file isn't merely a storage nuisance: it can trigger a re-review cycle that adds days to an approval queue already measured in weeks.
What Deduplication Actually Looks Like on the Ground
The city's Information Technology Agency, headquartered downtown on Spring Street, is running the core technical effort. Rather than simple byte-for-byte comparison, the program uses perceptual hashing, a method that derives a compact fingerprint from an image's visual content, so that files that look the same produce matching or near-matching fingerprints regardless of filename or metadata. That distinction matters because scanned permit drawings submitted through the LA City Planning online portal frequently arrive with different compression settings from different applicants, making them technically distinct files even when the underlying image is the same drawing submitted twice.
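To make the difference concrete, here is a minimal sketch in Python. The city has not said which perceptual hashing algorithm it uses, so the example assumes a simple "average hash" purely for illustration. The tiny synthetic images, the function names, and the noise pattern standing in for a different compression setting are all hypothetical.

```python
import hashlib

def average_hash(pixels, size=8):
    """Return a size*size-bit average hash of a grayscale image.

    `pixels` is a list of rows of 0-255 ints. Height and width are
    assumed to be multiples of `size` (an illustrative simplification).
    """
    height, width = len(pixels), len(pixels[0])
    bh, bw = height // size, width // size
    cells = []
    # Shrink the image to a size x size grid by averaging each block.
    for by in range(size):
        for bx in range(size):
            block = [pixels[y][x]
                     for y in range(by * bh, (by + 1) * bh)
                     for x in range(bx * bw, (bx + 1) * bw)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    # One bit per cell: 1 if the cell is brighter than the overall mean.
    bits = 0
    for cell in cells:
        bits = (bits << 1) | (1 if cell > mean else 0)
    return bits

def hamming(a, b):
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count("1")

def sha256_of(pixels):
    """Byte-for-byte fingerprint of the raw pixel data."""
    return hashlib.sha256(bytes(v for row in pixels for v in row)).hexdigest()

# Hypothetical 16x16 "scan": dark left half, bright right half.
original = [[200 if x >= 8 else 40 for x in range(16)] for y in range(16)]

# Same drawing with small pixel-level noise, standing in for a
# resubmission saved with different compression settings.
recompressed = [[original[y][x] + ((x * 7 + y * 3) % 5) - 2
                 for x in range(16)] for y in range(16)]

# A genuinely different image: bright top half, dark bottom half.
unrelated = [[200 if y < 8 else 40 for x in range(16)] for y in range(16)]

print("byte hashes equal:", sha256_of(original) == sha256_of(recompressed))
print("distance to recompressed copy:",
      hamming(average_hash(original), average_hash(recompressed)))
print("distance to unrelated image:",
      hamming(average_hash(original), average_hash(unrelated)))
```

In a scheme like this, two files would typically be flagged as duplicates when the distance between their hashes falls at or below a chosen threshold. Where to set that threshold is a tuning decision the article's sources do not describe.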
The Los Angeles County Metropolitan Transportation Authority, which shares some document repositories with the city for joint infrastructure projects along the Purple Line Extension corridor, began its own image deduplication audit in March 2026. Metro's digital asset library had ballooned to roughly 800 terabytes by late 2025, with internal estimates suggesting that somewhere between 18 and 22 percent of stored images were redundant. If accurate, that works out to roughly 144 to 176 terabytes of duplicate imagery, a meaningful chunk of an annual cloud storage bill that runs into the hundreds of thousands of dollars.
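The redundancy estimate can be turned into a rough storage and cost range with a few lines of arithmetic. The 800-terabyte total and the 18 to 22 percent range come from the figures above. The per-terabyte price is a hypothetical placeholder, not a rate Metro has disclosed, and should be replaced with an actual contract price.

```python
def redundant_range_tb(total_tb, low_pct, high_pct):
    """Return the (low, high) redundant volume in terabytes."""
    return total_tb * low_pct / 100, total_tb * high_pct / 100

def annual_cost_usd(tb, usd_per_tb_month):
    """Yearly cost of keeping `tb` terabytes at a flat monthly rate."""
    return tb * usd_per_tb_month * 12

TOTAL_TB = 800               # library size reported for late 2025
LOW_PCT, HIGH_PCT = 18, 22   # internal redundancy estimates

# ASSUMPTION: illustrative flat rate only, not a figure from Metro.
ASSUMED_USD_PER_TB_MONTH = 20

low_tb, high_tb = redundant_range_tb(TOTAL_TB, LOW_PCT, HIGH_PCT)
low_cost = annual_cost_usd(low_tb, ASSUMED_USD_PER_TB_MONTH)
high_cost = annual_cost_usd(high_tb, ASSUMED_USD_PER_TB_MONTH)

print(f"Redundant storage: {low_tb:.0f} to {high_tb:.0f} TB")
print(f"Annual cost at the assumed rate: ${low_cost:,.0f} to ${high_cost:,.0f}")
```

Real cloud pricing is tiered and includes retrieval and transfer charges, so a flat rate like this only gives an order-of-magnitude sense of what the duplicates cost to keep.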
Eastside neighborhood offices in Boyle Heights and El Sereno have also reported faster turnaround on housing inspection records since February, when the Housing Department began the first phase of its cleanup. That phase involves removing flagged duplicates from the Karen Bass administration's HomeSolution case management system, which tracks interim housing placements tied to the ongoing homelessness emergency declaration.
How L.A. Compares to London, Singapore and New York
London's Government Digital Service published a framework for public-sector image deduplication in 2023 and mandated its adoption across Transport for London and the Greater London Authority by the end of 2025. Singapore's Smart Nation initiative embedded automated deduplication into its whole-of-government document management system as far back as 2021, setting a benchmark that city IT administrators in both Los Angeles and New York have cited in planning documents.
New York City launched its own deduplication initiative under the NYC Office of Technology and Innovation in late 2024, concentrating first on the Department of Buildings — a direct parallel to L.A.'s focus on permitting infrastructure. New York's effort covers approximately 22 million legacy scanned documents, a larger corpus than L.A.'s, but city technology officials there have acknowledged the rollout has moved more slowly than projected because of union contract provisions governing records management work.
Los Angeles has an advantage in the relative youth of its digital infrastructure: much of the city's document scanning was done after 2015, meaning fewer paper-era artifacts and more consistent image formats. That said, the patchwork of agency-level systems, some running on platforms bought in the early 2000s, still creates interoperability headaches that no hashing algorithm fully solves on its own.
The ITA expects to complete Phase One of the deduplication project, covering Building and Safety and City Planning records, by October 2026. Departments that have not yet onboarded are encouraged to contact the ITA's Digital Services Division directly; the program is open to any city agency managing more than 500,000 stored image files. For residents trying to pull permit history on a property in neighborhoods like Silver Lake or Panorama City, the practical payoff should be faster online search results and fewer instances of conflicting document versions appearing in the same query.