The Daily Los Angeles

LA's Digital Archives Are Drowning in Duplicate Images — and the Numbers Tell a Damning Story

City agencies and cultural institutions across Los Angeles are burning through storage budgets and staff hours wrestling with redundant image files, and a new wave of audits is finally putting hard figures to the problem.

By Los Angeles News Desk · Published 4 July 2026, 11:48 am

Photo: Stephen Leonardi on Pexels

Los Angeles city agencies collectively hold tens of millions of digital image files across their servers, and a significant share of them are exact or near-exact duplicates — costing money, slowing workflows, and in some cases compromising public records integrity. That much is now clear from internal audits conducted this year at several municipal departments, though the full scope of the redundancy problem has rarely been quantified for the public.

The timing matters. With the 2028 Olympics infrastructure push generating massive volumes of construction documentation photography, and the Los Angeles Housing Department running parallel imaging campaigns to document properties under Mayor Karen Bass's housing emergency declaration, the city's data storage demands are growing fast. Image deduplication, the process of identifying redundant files, flagging them, and either deleting or consolidating them, has moved from a back-office IT nuisance to a line-item budget concern.
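For readers curious what the "identifying" step can look like in practice, here is a minimal sketch in Python. It assumes duplicates are byte-for-byte identical files and groups them by SHA-256 content hash. The function names and folder layout are hypothetical, and nothing here is drawn from any city agency's actual tooling.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_exact_duplicates(root):
    """Map each content hash to the files sharing it, keeping only hashes seen more than once."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

A sketch like this only flags candidates. Whether a flagged file is deleted or consolidated is a records-management decision, and files that differ by even one byte (a re-saved JPEG, for example) will not match.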

Where the Redundancy Lives

The Los Angeles Department of City Planning, headquartered at 200 N. Spring Street in Downtown, manages one of the largest municipal image repositories in California. Planning departments routinely accumulate duplicate files when multiple staff members photograph the same site visit or when scanned permit documents are uploaded through more than one workflow channel. Industry benchmarks from digital asset management firms suggest that enterprise-scale image libraries typically carry between 20 and 40 percent redundancy by file count — meaning for every ten images stored, two to four are functionally identical or near-identical to another file already in the system.

At the Los Angeles County Museum of Art on Wilshire Boulevard, a digitization initiative launched in 2023 to catalog over 142,000 collection objects ran directly into the duplicate problem. LACMA's digital collections team has publicly described the challenge of images captured under different lighting conditions or at varying resolutions being stored as separate files, inflating the working library and making retrieval slower. Storage costs for large cultural institutions can run $15,000 to $50,000 annually per petabyte depending on redundancy levels and backup tier — costs that compound when duplicates are never purged.
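Near-duplicates of the kind described here, such as the same object shot at a different resolution or brightness, will not share a file hash. One common family of techniques is perceptual hashing. The sketch below is a simplified "average hash" in plain Python, operating on grayscale images represented as lists of pixel rows. It is an illustration under stated assumptions (the 8x8 grid, the distance threshold of 5, and all function names are hypothetical choices), not a description of LACMA's workflow.

```python
def shrink(pixels, size=8):
    """Downscale a grayscale image (rows of 0-255 values) to size x size by block averaging.

    Assumes the image is at least `size` pixels in each dimension.
    """
    height, width = len(pixels), len(pixels[0])
    small = []
    for r in range(size):
        rows = range(r * height // size, (r + 1) * height // size)
        out_row = []
        for c in range(size):
            cols = range(c * width // size, (c + 1) * width // size)
            block = [pixels[y][x] for y in rows for x in cols]
            out_row.append(sum(block) / len(block))
        small.append(out_row)
    return small


def average_hash(pixels, size=8):
    """Return one bit per grid cell: True where the cell is brighter than the grid's mean."""
    flat = [value for row in shrink(pixels, size) for value in row]
    mean = sum(flat) / len(flat)
    return [value > mean for value in flat]


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


def looks_like_duplicate(image_a, image_b, max_distance=5):
    """Treat two images as near-duplicates when their hashes differ in few bits."""
    return hamming_distance(average_hash(image_a), average_hash(image_b)) <= max_distance
```

Because each bit records only whether a region is brighter than the image's own average, a uniformly brighter copy or a resized copy tends to produce the same hash. Real lighting changes are rarely uniform, so any threshold would need tuning against a reviewed sample.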

The Los Angeles Public Library system, which operates 73 branch locations and a major digital archive program through its Central Library on West Fifth Street, began a phased deduplication project in fiscal year 2025-26. Librarians working with the California Digital Library consortium have found that photographic collections digitized before 2018 — when metadata standards were less consistent — show the highest rates of duplication, sometimes exceeding 35 percent of total file counts in specific historical collections.

The Cost of Inaction

Storage isn't free. Cloud storage at commercial rates for government and institutional clients typically runs between $0.02 and $0.05 per gigabyte per month, depending on contract tier and redundancy protections. A municipal department sitting on 500 terabytes of image data — not unusual for a large planning or infrastructure agency — could be paying $10,000 to $25,000 monthly just for storage, with a meaningful fraction of that covering files that add no informational value.
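The arithmetic behind that range is easy to check. The short Python sketch below reproduces it, assuming decimal units (1 terabyte = 1,000 gigabytes). The 25 percent duplicate share used in the second calculation is a hypothetical input for illustration, not an audited figure.

```python
GB_PER_TB = 1_000  # assumption: decimal units


def monthly_storage_cost(terabytes, rate_per_gb_month):
    """Monthly bill in dollars for a given volume at a per-gigabyte monthly rate."""
    return terabytes * GB_PER_TB * rate_per_gb_month


def monthly_cost_of_duplicates(terabytes, rate_per_gb_month, duplicate_share):
    """Portion of the monthly bill attributable to duplicate data (share between 0 and 1)."""
    return monthly_storage_cost(terabytes, rate_per_gb_month) * duplicate_share


for rate in (0.02, 0.05):
    total = monthly_storage_cost(500, rate)
    wasted = monthly_cost_of_duplicates(500, rate, 0.25)  # 0.25 is a hypothetical share
    print(f"${rate:.2f}/GB: ${total:,.0f} per month, of which ${wasted:,.0f} covers duplicates")
```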

Deduplication software licensing costs have dropped sharply. Enterprise tools that would have run $80,000 or more annually five years ago now start around $12,000 per year for mid-sized deployments, according to published pricing from vendors including Hammerspace and Komprise. The return on investment case is increasingly straightforward: a one-time audit and cleanup that eliminates 25 percent of stored image volume can pay for itself within a single budget cycle.
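A rough payback calculation shows why. The Python sketch below uses the $12,000 annual license figure and a 25 percent reduction in stored volume, applied to a hypothetical department paying $10,000 to $25,000 a month for storage. It assumes the bill falls in proportion to the volume removed and ignores staff time, so treat it as a back-of-envelope estimate only.

```python
def payback_months(annual_license_cost, monthly_storage_bill, volume_removed_share):
    """Months of storage savings needed to cover one year of licensing.

    Assumes savings scale linearly with the share of volume removed.
    """
    monthly_savings = monthly_storage_bill * volume_removed_share
    if monthly_savings <= 0:
        raise ValueError("No savings, so the license cost is never recovered")
    return annual_license_cost / monthly_savings


for bill in (10_000, 25_000):  # hypothetical monthly storage bills
    months = payback_months(12_000, bill, 0.25)
    print(f"${bill:,} monthly bill: payback in about {months:.1f} months")
```

Under these assumptions the license is recovered in roughly two to five months, comfortably inside a twelve-month budget cycle.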

For LA's 2028 preparations specifically, the stakes are higher still. The LA28 organizing committee and city infrastructure teams are generating construction documentation at Exposition Park, SoFi Stadium in Inglewood, and the Los Angeles Memorial Coliseum at volumes that will stress any unmanaged archive. Getting deduplication workflows established now, before those libraries grow by another order of magnitude, is the practical path forward.

Agencies that haven't yet started should request a baseline audit from their IT departments before the end of the current fiscal year on June 30. The audit itself costs little; the alternative, paying indefinitely to store the same photograph twice, costs a great deal more.

About this article

Published by The Daily Los Angeles

This article was produced by The Daily Los Angeles editorial desk and covers news in Los Angeles.
