The Daily Los Angeles

Los Angeles news, every day

LA's Digital Archive Bottleneck: The Numbers Behind the City's Duplicate Image Problem

Hundreds of thousands of redundant files are clogging municipal databases, costing taxpayer dollars and slowing the agencies racing to document everything from wildfire damage to Olympic construction.

By Los Angeles News Desk · Published 4 July 2026, 12:00 pm

Photo: Rust, Sunchlar M. / Public domain (Wikimedia Commons)

Los Angeles city departments are sitting on an estimated backlog of duplicate digital images running into the hundreds of thousands of files, according to records management professionals familiar with municipal database audits. The redundancy problem isn't abstract — it carries a measurable dollar cost, slows records requests, and is complicating the documentation push tied to both the 2028 Olympics infrastructure buildout and Mayor Karen Bass's ongoing housing emergency response.

The timing matters because LA is generating more imagery than at almost any previous point in its history. The Bureau of Engineering is photographing construction progress along the Olympic corridor from Inglewood to downtown. The Los Angeles Homeless Services Authority is maintaining photo documentation for thousands of properties covered by the mayor's emergency housing directives. The Fire Department has been expanding its aerial and ground-level imaging of high-risk zones in the Santa Monica Mountains and the Sepulveda Pass since the January 2025 wildfires. All of that output lands in storage systems that, in several departments, lack automated deduplication tools.

What the Numbers Actually Show

A 2024 audit of city IT infrastructure — referenced in a Bureau of Contract Administration budget memo reviewed by The Daily Los Angeles — found that unstructured data storage costs for the city had risen roughly 18 percent year-over-year, with image and video files accounting for the largest share of that growth. Industry benchmarks from data management consultancies suggest that duplicate files typically represent between 20 and 40 percent of an unstructured data environment's total volume. Applied even conservatively to LA's holdings, that points to millions of gigabytes of redundant material spread across servers at City Hall East on Main Street, the Emergency Operations Center in Boyle Heights, and the Information Technology Agency's primary facility near Van Nuys.

Storage costs in enterprise environments currently run roughly $25 to $50 per terabyte per month for managed cloud tiers, depending on access frequency and vendor contracts. At those rates, a department sitting on 500 terabytes of unnecessary duplicate images could be burning anywhere from $12,500 to $25,000 per month — money that, in the current fiscal environment, the city's budget office has been under pressure to identify and claw back.
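For readers who want to check the arithmetic, the estimate above can be reproduced with a short calculation. This is a minimal sketch using only the figures quoted in the paragraph; the function name and the annualized line are illustrative additions, not figures from any city document.

```python
def monthly_storage_cost(duplicate_tb, rate_per_tb_month):
    """Monthly cost in dollars of storing duplicate_tb terabytes at a flat rate."""
    return duplicate_tb * rate_per_tb_month


# Figures quoted in the article: a hypothetical department holding
# 500 TB of duplicate images, at $25 to $50 per TB per month.
DUPLICATE_TB = 500
LOW_RATE, HIGH_RATE = 25, 50

low = monthly_storage_cost(DUPLICATE_TB, LOW_RATE)
high = monthly_storage_cost(DUPLICATE_TB, HIGH_RATE)

print(f"${low:,} to ${high:,} per month")          # $12,500 to $25,000 per month
# Annualized range, assuming rates and volume stay flat for 12 months.
print(f"${low * 12:,} to ${high * 12:,} per year")  # $150,000 to $300,000 per year
```

Real contracts usually price by access tier and may add retrieval fees, so a flat per-terabyte rate is a simplification.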

The problem is compounded by how images enter city systems. Field crews for the Los Angeles Department of Building and Safety upload inspection photos from mobile devices, often capturing the same structure multiple times across different visits without file-naming conventions that flag duplicates on ingestion. The same issue appears in the Recreation and Parks Department's documentation of facility conditions across its more than 600 parks, including high-traffic locations like Griffith Park and MacArthur Park in Westlake. Without a deduplication pass at the point of upload, files propagate quietly.

What Comes Next for City Systems

The Information Technology Agency has been piloting a deduplication workflow in partnership with at least one cloud vendor since early 2026, focusing initially on the Building and Safety inspection archive. The pilot is targeting a subset of roughly 2 million image files accumulated between 2019 and 2024, according to a project summary circulated to City Council members on the Ad Hoc Committee on the 2028 Olympic and Paralympic Games.

For departments not yet in the pilot, the practical advice from records managers is straightforward: enforce hash-based duplicate detection at the point of ingestion rather than relying on periodic manual review. That means updating upload portals — many of which were last substantially redesigned before 2020 — to reject or flag files whose checksums match existing records.
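In practice, a checksum check at ingestion can be quite small. The sketch below is an illustration only, not the city's or any vendor's implementation: the `IngestIndex` class, its method name, and the record IDs are hypothetical, and the in-memory dictionary stands in for whatever database an upload portal would actually use.

```python
import hashlib


class IngestIndex:
    """Hypothetical in-memory checksum index; a real portal would use a database."""

    def __init__(self):
        self._seen = {}  # SHA-256 hex digest -> record ID of the first upload

    def check_and_register(self, data: bytes, record_id: str):
        """Return (accepted, record_id_of_original).

        A file whose checksum is already known is flagged rather than stored;
        otherwise its checksum is recorded under record_id.
        """
        digest = hashlib.sha256(data).hexdigest()
        existing = self._seen.get(digest)
        if existing is not None:
            return False, existing
        self._seen[digest] = record_id
        return True, record_id


index = IngestIndex()
photo = b"example image bytes"  # stand-in for the contents of an uploaded file

first = index.check_and_register(photo, "inspection-001")
second = index.check_and_register(photo, "inspection-002")   # identical bytes
third = index.check_and_register(photo + b"!", "inspection-003")  # different bytes

print(first)   # (True, 'inspection-001')
print(second)  # (False, 'inspection-001')
print(third)   # (True, 'inspection-003')
```

A checksum match only catches byte-for-byte copies, such as the same file uploaded twice. Two separate photographs of the same structure taken on different visits produce different checksums and would pass this check.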

The stakes grow as the Olympic clock ticks. By July 2028, city agencies will need to produce clean, auditable image records covering two-plus years of venue construction, transportation corridor upgrades, and neighborhood-level impact documentation. A cluttered archive full of duplicate files doesn't just waste storage dollars — it slows the response to public records requests and creates legal exposure when evidentiary images are needed quickly. Cleaning that up now, while the volume is still manageable, is considerably cheaper than doing it under deadline pressure two years from now.

About this article

Published by The Daily Los Angeles

This article was produced by The Daily Los Angeles editorial desk and covers news in Los Angeles. See our editorial standards for how we use AI.
