The Daily Los Angeles


LA's Digital Archive Problem: The Hidden Cost of Duplicate Images Clogging City Systems

From the Bureau of Engineering to the Olympic infrastructure database, redundant image files are quietly draining storage budgets and slowing down Los Angeles's most critical public tech projects.

By Los Angeles News Desk · Published 4 July 2026, 11:40 am


Los Angeles city agencies collectively store what is estimated to be tens of millions of digital image files on municipal servers, and a growing share of that data is duplicated, sometimes copied dozens of times across departments that never talk to each other. The problem has a name in IT circles: duplicate image proliferation. And in a city spending aggressively on digital infrastructure ahead of the 2028 Summer Olympics, the redundancy is costing real money.

The timing matters. The city's Bureau of Engineering is managing overlapping data pipelines for Olympic venue construction at sites including SoFi Stadium in Inglewood and the proposed Village infrastructure near UCLA in Westwood. Meanwhile, the Los Angeles Housing Department — already under pressure from Mayor Karen Bass's housing emergency declaration, which she extended in January 2025 — is digitizing property inspection records at a rate that has overwhelmed existing deduplication protocols, according to public contract filings reviewed by The Daily Los Angeles.

The Numbers Behind the Mess

Storage costs are the clearest signal. Enterprise cloud storage at the scale municipal governments operate runs between $0.02 and $0.08 per gigabyte per month depending on the tier. A single high-resolution site photograph taken during a Los Angeles Department of Building and Safety field inspection can exceed 8 megabytes. Multiply that by the thousands of inspections conducted monthly across neighborhoods from Boyle Heights to the San Fernando Valley, add three or four duplicate copies created each time a file is emailed, uploaded to a shared drive, and attached to a case management ticket, and the math turns ugly fast.
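The arithmetic in that paragraph can be sketched in a few lines of Python. The per-photo size, the number of copies, and the per-gigabyte price range come from the figures above. The monthly photo volume is a made-up number used only for illustration, and the function names are hypothetical.

```python
def stored_gb(photos, mb_per_photo, copies_per_photo):
    """Total gigabytes stored, counting every copy of every photo (1 GB = 1,000 MB)."""
    return photos * mb_per_photo * copies_per_photo / 1000


def monthly_cost_usd(gigabytes, usd_per_gb_month):
    """Monthly bill for holding that many gigabytes at a flat per-GB rate."""
    return gigabytes * usd_per_gb_month


def duplicate_share(copies_per_photo):
    """Fraction of stored bytes that are redundant when each photo exists N times."""
    return (copies_per_photo - 1) / copies_per_photo


# ASSUMPTION: 5,000 photos a month is an invented volume, not a city figure.
PHOTOS_PER_MONTH = 5_000
MB_PER_PHOTO = 8   # from the article: a single photo can exceed 8 MB
COPIES = 4         # the original plus three duplicates (article: three or four)

unique_gb = stored_gb(PHOTOS_PER_MONTH, MB_PER_PHOTO, 1)
actual_gb = stored_gb(PHOTOS_PER_MONTH, MB_PER_PHOTO, COPIES)

for rate in (0.02, 0.08):  # article: $0.02 to $0.08 per GB per month
    print(
        f"rate ${rate:.2f}/GB: "
        f"unique ${monthly_cost_usd(unique_gb, rate):.2f}, "
        f"with duplicates ${monthly_cost_usd(actual_gb, rate):.2f}"
    )

print(f"share of stored bytes that are duplicates: {duplicate_share(COPIES):.0%}")
```

The redundant share depends only on the number of copies per photo, not on the photo volume, and each month's uploads add to a bill that recurs for as long as the files are kept.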

Industry benchmarks from the International Data Corporation suggest that between 20 and 30 percent of stored enterprise data is redundant, obsolete, or trivial. Applied to the city's known infrastructure, that range implies that a meaningful slice of Los Angeles's IT storage budget is funding files that serve no operational purpose. City budget documents put that budget at roughly $180 million for the 2025-26 fiscal year across the Information Technology Agency. The ITA has not published a standalone deduplication audit, but the issue surfaced in a 2024 performance review of the city's GeoHub platform, the open data portal run jointly with the Bureau of Engineering.

The GeoHub hosts aerial and street-level imagery used by urban planners, fire risk modelers, and homelessness outreach coordinators at organizations including the Los Angeles Homeless Services Authority. LAHSA's field teams rely on address-linked photographs to verify encampment locations from Skid Row to the LA River Greenway corridor. When the same address generates multiple image records with slightly different filenames — a common artifact of how different apps upload photos — analysts must manually reconcile them or risk acting on stale data.

What Deduplication Actually Fixes

Automated duplicate image detection tools use perceptual hashing, a process that converts image content into a short numerical fingerprint. Two files whose fingerprints match or differ by only a few bits are flagged as near-identical, even when their filenames differ. Several vendors have pitched such tools to Los Angeles city departments in recent procurement cycles. A contract awarded in March 2026 to modernize document management at the Los Angeles County Registrar-Recorder's office included deduplication components, though the city and county operate separate systems and the benefit does not automatically transfer across jurisdictions.
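For readers who want to see the idea, here is a minimal sketch of one simple perceptual hash, the average hash. It is not a description of any vendor's product. It assumes the image has already been decoded and shrunk to a small grayscale grid (8 by 8 here), a step a real tool would hand to an image library. The function names and the distance threshold are illustrative choices.

```python
def average_hash(pixels):
    """Fingerprint a small grayscale grid (values 0-255) as an integer.

    Each pixel contributes one bit: 1 if it is brighter than the grid's
    mean brightness, 0 otherwise.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(h1, h2, max_distance=5):
    """ASSUMPTION: a threshold of 5 bits out of 64 is an arbitrary example value."""
    return hamming_distance(h1, h2) <= max_distance


# A synthetic 8x8 gradient stands in for a downscaled inspection photo.
original = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]
# The same picture, slightly brighter, as if re-exported by another app.
brighter = [[min(255, p + 3) for p in row] for row in original]
# A different picture: the gradient reversed.
inverted = [[252 - p for p in row] for row in original]

h_original = average_hash(original)
h_brighter = average_hash(brighter)
h_inverted = average_hash(inverted)

print("original vs brighter:", hamming_distance(h_original, h_brighter))
print("original vs inverted:", hamming_distance(h_original, h_inverted))
```

Because the fingerprint is computed from pixel content, the filename plays no part in the comparison, which is what lets this approach catch copies that were renamed on upload.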

For the 2028 Olympics push, the stakes are higher. The LA28 organizing committee is building a shared media asset library that will eventually interface with city permitting and traffic management databases. Duplicate images inside that system don't just waste storage — they create version-control failures that can delay permits or send construction crews to the wrong GPS coordinates.

Practical steps are already available to city departments without waiting for a new contract. Agencies can run open-source deduplication scans on existing file servers, establish a single canonical upload pathway for field inspectors, and require image metadata standards — including GPS tags and timestamps — before files enter any shared database. The Los Angeles Department of City Planning began piloting standardized metadata requirements for its community plan update documents in early 2026, a model other departments could replicate by the fourth quarter of this year.
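Two of those steps can be sketched with nothing beyond the Python standard library. The first function groups byte-for-byte identical files under a folder by their SHA-256 digest, which catches exact copies only, not resized or re-encoded ones. The second checks a record for required metadata before it is accepted. The field names (`gps_lat`, `gps_lon`, `captured_at`) are hypothetical, not drawn from any city standard.

```python
import hashlib
import os
import tempfile
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's contents, read in chunks to keep memory use flat."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_exact_duplicates(root):
    """Return lists of paths under root whose contents are byte-identical."""
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_digest[file_digest(path)].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]


# ASSUMPTION: illustrative field names, not an actual metadata standard.
REQUIRED_FIELDS = ("gps_lat", "gps_lon", "captured_at")


def missing_metadata(record):
    """List the required fields that are absent or empty in a record."""
    return [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]


# Small demonstration on throwaway files in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    samples = {"site_001.jpg": b"photo-A", "IMG_4471.jpg": b"photo-A", "site_002.jpg": b"photo-B"}
    for name, data in samples.items():
        with open(os.path.join(folder, name), "wb") as f:
            f.write(data)
    for group in find_exact_duplicates(folder):
        print("identical:", [os.path.basename(p) for p in group])

print("missing:", missing_metadata({"gps_lat": 34.05, "gps_lon": -118.24}))
```

A scan like this only reports duplicates. Deciding which copy is canonical, and what to do with the rest, remains a policy question for each department.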

The city has the tools. The question is whether department heads move before Olympic deadline pressure makes the cleanup far more expensive than the storage bill it was always supposed to prevent.

About this article

Published by The Daily Los Angeles

This article was produced by The Daily Los Angeles editorial desk and covers news in Los Angeles. See our editorial standards for how we use AI.
