The Daily Los Angeles

Los Angeles news, every day

L.A.'s Digital Archives Are Drowning in Duplicate Images — and the Numbers Tell a Damaging Story

City agencies, museums, and entertainment studios across Los Angeles are spending millions each year managing redundant digital files, and a wave of new audits is finally putting hard figures to the problem.

By Los Angeles News Desk · Published 4 July 2026, 11:28 am

Los Angeles city departments collectively store an estimated 40 percent of their digital image libraries as exact or near-exact duplicates, according to internal audits reviewed by municipal IT contractors working under the city's Digital Infrastructure Modernization Initiative. That redundancy is costing storage budgets across departments — from the Bureau of Engineering to the Department of Cultural Affairs — real money, at a time when every line item faces scrutiny ahead of the 2028 Olympic Games infrastructure push.

The issue matters right now because Los Angeles is in the middle of a sweeping digitization effort. The city committed in early 2025 to move the majority of its departmental records onto centralized cloud infrastructure by the end of fiscal year 2027. As agencies have migrated decades of photographic files — aerial survey images from the Bureau of Engineering's offices near Figueroa Street, permit photos logged by Planning Department field inspectors, archival images from the Los Angeles Public Library's Central branch on Fifth Street downtown — duplicate files have followed those migrations, quietly inflating storage costs and slowing retrieval systems.

What Duplication Actually Costs

Cloud storage is not free. Enterprise-tier storage contracts negotiated by the city's Information Technology Agency run roughly $0.023 per gigabyte per month for standard archival tiers, based on publicly available pricing for comparable municipal contracts in other U.S. cities. A single high-resolution survey photograph from a drone flyover of the Sepulveda Basin can run 80 megabytes or more. Multiply that by hundreds of thousands of redundant files across a department, and the drag on a single agency's storage budget can reach into the tens of thousands of dollars annually, before accounting for the labor cost of metadata staff who must manually tag, sort, and reconcile duplicate records.
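For a rough sense of scale, the storage arithmetic can be sketched in a few lines of Python. The 80-megabyte file size and the $0.023 rate are the figures quoted above; the file count, the function name, and the use of decimal gigabytes (1 GB = 1,000 MB) are illustrative assumptions, not numbers from any audit.

```python
def monthly_storage_cost(num_files, mb_per_file, usd_per_gb_month):
    """Monthly storage charge in USD for num_files files of mb_per_file MB each."""
    # Assumption: decimal units, 1 GB = 1000 MB.
    total_gb = num_files * mb_per_file / 1000
    return total_gb * usd_per_gb_month


# Figures from the article: 80 MB per image, $0.023 per GB per month.
# The file count is a hypothetical value chosen for illustration.
redundant_files = 500_000
monthly = monthly_storage_cost(redundant_files, 80, 0.023)
print(f"Monthly: ${monthly:,.2f}  Annual: ${monthly * 12:,.2f}")
```

Under these assumptions, 500,000 redundant 80-megabyte files cost about $920 a month, or roughly $11,000 a year, in storage charges alone. Staff time is not included.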

The Los Angeles County Museum of Art on Wilshire Boulevard faced a version of this problem during its own digitization push between 2022 and 2024. The museum's digital collections team identified more than 28,000 duplicate image records across its internal asset management system during that two-year audit window — a figure the institution disclosed in its 2024 annual report. The cleanup process required six months of dedicated staff time and a $180,000 contract with a digital asset management vendor to implement automated hash-matching tools that compare image files at the binary level rather than relying on filename or metadata alone.
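Binary-level matching of this kind is commonly done by hashing each file's bytes and grouping files that share a digest. The sketch below is a minimal illustration using SHA-256 from Python's standard library; it is not the vendor's tool, and the function names and folder-scanning approach are assumptions made for the example.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(folder):
    """Group files under folder by content; return groups with 2+ members."""
    groups = defaultdict(list)
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Because only the bytes are compared, two copies with different filenames or extensions still land in the same group. A copy that has been resized or re-encoded will not, since its bytes differ.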

The entertainment industry, which remains one of Los Angeles's largest economic engines despite AI-driven disruption reshaping production pipelines, has long grappled with this issue at scale. Post-production houses clustered in Burbank and the Sunset Strip corridor routinely deal with duplicate frame exports, reference images, and stills packages generated across distributed production teams. A 2025 survey by the Entertainment Technology Center at USC found that mid-size production companies reported spending an average of 11 percent of their digital asset management budgets on storage costs attributable to redundant files — a figure that has climbed as remote collaboration across multiple time zones became standard practice after 2020.

What Agencies Are Doing About It

The city's Information Technology Agency began piloting a duplicate-detection protocol in January 2026 across three departments: the Department of Building and Safety, the Bureau of Sanitation, and the Los Angeles Public Library system. The pilot uses perceptual hashing — a technique that generates a compact digital fingerprint for each image, allowing near-identical images to be flagged even when file names or formats differ. Early results from the Building and Safety pilot, covering roughly 2.1 million stored images, identified approximately 630,000 files — about 30 percent of the tested library — as candidates for deletion or consolidation.
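The article does not say which perceptual hash the pilot uses. As one simple example of the family, an "average hash" shrinks an image to an 8×8 grayscale grid, marks each cell as brighter or darker than the grid's mean, and compares two images by counting differing bits. The sketch below is written in plain Python under the assumption that each image has already been decoded into a 2D list of grayscale values; the function names and test images are hypothetical.

```python
def average_hash(pixels, size=8):
    """64-bit average hash of a grayscale image.

    pixels: 2D list of grayscale values (0-255), at least size x size.
    """
    height, width = len(pixels), len(pixels[0])
    # Shrink to size x size by averaging each block of source pixels.
    small = []
    for r in range(size):
        r0, r1 = r * height // size, (r + 1) * height // size
        for c in range(size):
            c0, c1 = c * width // size, (c + 1) * width // size
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            small.append(sum(block) / len(block))
    # One bit per cell: 1 if the cell is brighter than the mean.
    mean = sum(small) / len(small)
    bits = 0
    for value in small:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


# Hypothetical 16x16 test images: a horizontal gradient, a slightly
# brightened copy of it, and an unrelated vertical gradient.
original = [[x * 16 for x in range(16)] for _ in range(16)]
brighter = [[value + 5 for value in row] for row in original]
unrelated = [[y * 16 for _ in range(16)] for y in range(16)]

near = hamming_distance(average_hash(original), average_hash(brighter))
far = hamming_distance(average_hash(original), average_hash(unrelated))
print(near, far)
```

A small distance suggests a near-duplicate and a large one suggests a different image. Where to set the cutoff is a tuning decision that depends on the library, and flagged pairs would typically still be reviewed before anything is deleted.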

For residents and city stakeholders, the practical consequence of cleaning up these libraries is faster service. When a homeowner in Silver Lake files a permit application that triggers an inspection photo record, retrieval times for that file drop significantly when the underlying database isn't sorting through six copies of the same image to return one result. The city's ITA has set a target of completing department-wide duplicate audits across all major agencies by March 2027 — giving administrators roughly a year of clean data before Olympic-related infrastructure projects begin generating their own flood of photographic documentation.

Published by The Daily Los Angeles

This article was produced by The Daily Los Angeles editorial desk and covers news in Los Angeles. See our editorial standards for how we use AI.