The Daily Los Angeles

How Los Angeles's Digital Archives Became a Swamp of Duplicate Images — and Who's Now Paying to Fix It

Years of rushed scanning projects, siloed city departments, and budget shortcuts left LA's public record systems bloated with redundant files; now the bill for cleaning it up is coming due.

By Los Angeles News Desk · Published 4 July 2026, 12:00 pm

Photo: Stephen Leonardi on Pexels

Los Angeles city agencies collectively store tens of millions of digital image files across dozens of disconnected databases, and a significant share of those files are duplicates: copies of copies churned out during three separate rounds of emergency digitisation. Those rounds began after the 2018 Woolsey Fire exposed how fragile paper-based record-keeping really was. That realisation, and the chaotic scramble that followed, is the origin of a problem the city's Information Technology Agency is now formally trying to solve.

The stakes are not abstract. With the 2028 Summer Olympics infrastructure buildout accelerating across venues from SoFi Stadium in Inglewood to the proposed athletes' village corridor near USC, city planners and permit offices are pulling environmental impact files, parcel maps, and engineering photographs at an unprecedented rate. Duplicate images slow retrieval, inflate storage costs, and, in the worst cases, push outdated versions of documents in front of decision-makers who have no way to know they're looking at superseded files.

Three Digitisation Waves, One Uncoordinated Mess

The first major push came in 2019, after the Mayor's Office of Resilience flagged that the Department of Building and Safety's paper permit archive in Van Nuys was vulnerable to both fire and flood. Contractors scanned roughly 4.2 million pages over eighteen months, uploading files to a server infrastructure that, at the time, had no deduplication protocol. A second wave followed in 2021 under a pandemic-era federal relief spending plan, when the Los Angeles County Metropolitan Transportation Authority ran its own parallel effort to digitise decades of rail and bus corridor planning photographs. Metro's project and the city's project used different file-naming conventions and different metadata standards. Neither talked to the other.

The third wave arrived in 2023 under Mayor Karen Bass's housing emergency declaration. Her administration pushed to digitise zoning records and parcel imagery across all fifteen City Council districts to accelerate permitting for emergency shelter and affordable housing construction. Speed was the priority. Quality control was not. According to a review published by the City Controller's Office in March 2025, the digitisation vendor hired under that program delivered files with an estimated duplication rate of between 18 and 23 percent across the housing-related image archive. That single contract, valued at approximately $6.7 million, is now cited internally as the clearest example of what happens when procurement moves faster than IT governance.

The Getty Conservation Institute, based on Sepulveda Boulevard in Brentwood, has worked alongside the Los Angeles Public Library's Central Branch downtown on archival standards for years. Staff from both institutions have informally told city officials that the problem is not unique to LA, but that the scale here is unusually large given how many agencies operate semi-independently. The Los Angeles City Archives on Main Street in the Civic Center district, which manages the city's official historical record, has its own deduplication workflow but no authority to impose it on other departments.

What a Fix Actually Looks Like

The ITA issued a request for proposals in April 2026 seeking vendors capable of running automated hash-matching and perceptual-hashing tools across an estimated 80 terabytes of image data held by six pilot departments, including the Department of City Planning and the Bureau of Engineering. The contract ceiling in that RFP was listed at $2.1 million for the initial phase. Responses were due May 30, and the ITA confirmed it received seven bids, though award details have not yet been made public.
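For readers curious about what the two techniques named in the RFP involve, the following is a minimal sketch, not the ITA's or any vendor's actual tooling. Hash-matching catches files that are byte-for-byte identical; perceptual hashing is meant to catch near-duplicates, such as the same page scanned twice at slightly different brightness. The function names, file names, and the choice of a simple "average hash" are assumptions made for illustration, and the perceptual step assumes each image has already been reduced to a small grayscale grid.

```python
import hashlib
from collections import defaultdict


def exact_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    """Group file names whose contents are byte-for-byte identical.

    `files` maps a (hypothetical) file name to its raw bytes.
    """
    groups = defaultdict(list)
    for name, data in files.items():
        # Identical bytes always produce the same SHA-256 digest.
        groups[hashlib.sha256(data).hexdigest()].append(name)
    # Only groups with more than one member are duplicates.
    return [sorted(names) for names in groups.values() if len(names) > 1]


def average_hash(pixels: list[list[int]]) -> int:
    """Perceptual 'average hash' of a small grayscale grid (values 0-255).

    Each bit records whether a pixel is brighter than the grid's mean, so
    rescans that differ slightly in brightness tend to yield nearby hashes.
    Assumes the image was already downscaled to this grid elsewhere.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count("1")
```

In this sketch, two files with identical bytes land in the same group from `exact_duplicates`, while two scans whose `average_hash` values have a small `hamming_distance` would be flagged as likely near-duplicates for human review. The distance threshold for "small" is a tuning decision the RFP does not specify.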

For residents and businesses dealing with permitting — particularly the contractors and architects working on the dense cluster of Olympic-related construction projects near the Exposition Park corridor — the practical effect of the cleanup will be faster document retrieval and fewer instances of inspectors or plan-checkers pulling the wrong revision of a site photograph. The ITA has said it expects the first pilot results by the end of the third quarter of 2026. Whether the fix holds long-term will depend on whether the city finally adopts a unified metadata standard before the next emergency forces another rushed digitisation sprint.

About this article

Published by The Daily Los Angeles

This article was produced by The Daily Los Angeles editorial desk and covers news in Los Angeles. See our editorial standards for how we use AI.