The Daily Los Angeles

LA's Digital Archives Are Drowning in Duplicate Images — And the Numbers Tell a Costly Story

From City Hall records to LAPD evidence files, redundant image data is quietly consuming storage budgets across Los Angeles city and county systems.

By Los Angeles News Desk · Published 4 July 2026, 11:57 am

Los Angeles city and county agencies collectively maintain tens of millions of digital image files across their records systems, and a growing share of that storage is being eaten up by exact or near-exact duplicates — a problem that IT administrators across the region say has reached a tipping point in fiscal year 2025-26.

The timing matters. With Mayor Karen Bass's housing emergency still driving rapid document processing at the Los Angeles Housing Department on Figueroa Street, and with 2028 Olympics infrastructure contracts generating thousands of new permitting photos weekly, the volume of images flowing into public databases has accelerated faster than legacy deduplication protocols can handle. Every redundant file stored is a dollar spent twice.

What the Numbers Actually Show

Industry benchmarks for large municipal governments suggest that between 20 and 40 percent of stored image data in unmanaged archives is duplicate or near-duplicate content, according to analyses published by the Storage Networking Industry Association. For a city the size of Los Angeles — which operates one of the largest municipal IT infrastructures in the United States — even the low end of that range translates to a substantial financial hit.

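The Los Angeles County Department of Public Works alone processed more than 1.2 million digital permit and inspection photographs in fiscal year 2024, according to figures the department published in its annual performance report. If roughly a quarter of those files were redundant duplicates, that would be about 300,000 surplus images from a single year. At current commercial cloud rates of approximately $0.023 per gigabyte per month, the raw storage charge depends heavily on file size: at an assumed average of 5 megabytes per image it comes to only a few hundred dollars a year, so a bill in the hundreds of thousands of dollars would require much larger files, many years of accumulated archives, or costs beyond raw storage such as backup copies and retrieval fees. Either way, that is one department's intake for one year.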
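The arithmetic behind that estimate can be sketched in a few lines. The photo count, duplicate share, and per-gigabyte rate below come from the figures above; the average file size is an assumption, since the department's report is not quoted on it, and the function name is purely illustrative.

```python
# Rough storage-cost estimate for redundant image files.
# Photo count, duplicate share, and per-GB rate are taken from the article.
# The average file sizes are ASSUMPTIONS used only to show sensitivity.

def annual_duplicate_cost(total_files, duplicate_share, avg_file_mb, usd_per_gb_month):
    """Return the yearly storage cost (USD) attributable to duplicate files."""
    duplicate_files = total_files * duplicate_share
    duplicate_gb = duplicate_files * avg_file_mb / 1000  # decimal gigabytes
    return duplicate_gb * usd_per_gb_month * 12

TOTAL_FILES = 1_200_000      # photos processed in fiscal year 2024
DUPLICATE_SHARE = 0.25       # "roughly a quarter"
USD_PER_GB_MONTH = 0.023     # approximate commercial cloud rate

for avg_mb in (5, 50, 500):  # hypothetical average file sizes in megabytes
    cost = annual_duplicate_cost(TOTAL_FILES, DUPLICATE_SHARE, avg_mb, USD_PER_GB_MONTH)
    print(f"{avg_mb:>4} MB per file: ${cost:,.0f} per year")
```

Under these assumptions the result scales linearly with file size, so the total is driven far more by what is being stored (still photos versus video) and how long it is retained than by the per-gigabyte rate itself.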

The LAPD's Digital Evidence Management System, housed at the Evidence Control Section facility in downtown Los Angeles, faces a compounded version of the problem. Body-worn camera footage and crime scene photography are often uploaded multiple times by different officers attending the same incident, creating redundant copies that must legally be retained for set periods. A 2024 audit of body-worn camera programs in major U.S. cities, published by the Police Executive Research Forum, found that duplicate uploads accounted for an average of 17 percent of total stored data volume across surveyed departments.

The Fix Is Faster Than the Budget Cycle

Several Los Angeles agencies have already begun piloting deduplication tools. The Los Angeles County Metropolitan Transportation Authority launched a digital asset audit in January 2026 covering its construction documentation archives for the Crenshaw/LAX line extension and the East San Fernando Valley Transit Corridor project. Metro's goal, outlined in its FY2026 technology roadmap, is to reduce redundant stored data by 30 percent before the end of the calendar year.

The Los Angeles City Clerk's Office, which digitizes council meeting records and hearing photographs at its Spring Street offices, began deploying hash-based deduplication software in March 2026. Hash-based systems compute a fingerprint from each image file's contents, so byte-identical copies yield the same fingerprint and can be flagged automatically before they are written to long-term storage. That catches the problem at the point of entry rather than after years of accumulation.
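As a minimal sketch of how point-of-entry hash deduplication can work, the example below fingerprints each incoming file with SHA-256 and refuses to store a second copy of identical bytes. It is not the City Clerk's software, whose design the article does not describe; the `DedupStore` class and file names are hypothetical.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()

class DedupStore:
    """Hypothetical in-memory index that flags byte-identical uploads."""

    def __init__(self):
        self._by_hash = {}  # fingerprint -> name of the first file stored

    def ingest(self, name: str, data: bytes):
        """Record the file unless identical bytes are already held.

        Returns the name of the existing copy for a duplicate, else None.
        """
        digest = fingerprint(data)
        existing = self._by_hash.get(digest)
        if existing is not None:
            return existing          # duplicate: flag it, do not store again
        self._by_hash[digest] = name
        return None

store = DedupStore()
first = store.ingest("hearing_001.jpg", b"example image bytes")
second = store.ingest("hearing_001_copy.jpg", b"example image bytes")
print(first, second)
```

A cryptographic hash only matches files that are identical byte for byte. A resized, re-compressed, or cropped copy produces a different digest and passes through unflagged.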

For smaller agencies and nonprofit organizations working within the LA ecosystem — including groups like the Los Angeles Conservancy, which maintains photographic archives of historic structures throughout neighborhoods from Boyle Heights to Hancock Park — the practical advice is straightforward. Running a free or low-cost deduplication pass on existing archives before migrating to new cloud storage contracts can eliminate months of unnecessary billing. Tools like dupeGuru and open-source libraries built on perceptual hashing are widely available and require no enterprise licensing.
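Perceptual hashing differs from file hashing in that it fingerprints what an image looks like, so near-duplicates land close together. The sketch below implements one simple variant, an average hash, in pure Python on a grid of grayscale values. It is an illustration only, not the algorithm dupeGuru or any particular library uses, and the sample images are synthetic.

```python
def average_hash(pixels, hash_size=8):
    """Return an integer average hash for a 2D list of grayscale values."""
    height, width = len(pixels), len(pixels[0])
    if height < hash_size or width < hash_size:
        raise ValueError("image must be at least hash_size pixels on each side")
    cells = []
    for r in range(hash_size):
        r0, r1 = r * height // hash_size, (r + 1) * height // hash_size
        for c in range(hash_size):
            c0, c1 = c * width // hash_size, (c + 1) * width // hash_size
            # Shrink the image by averaging each block of pixels into one cell.
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        # One bit per cell: is it brighter than the image-wide mean?
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits

def hamming_distance(a, b):
    """Count the bit positions at which two hashes differ."""
    return bin(a ^ b).count("1")

# Synthetic 16x16 test images: a left-to-right gradient, a slightly
# brightened copy, and a tonally inverted version.
gradient = [[x * 16 for x in range(16)] for _ in range(16)]
brighter = [[value + 10 for value in row] for row in gradient]
inverted = [[255 - value for value in row] for row in gradient]

original = average_hash(gradient)
print(hamming_distance(original, average_hash(brighter)))
print(hamming_distance(original, average_hash(inverted)))
```

A small Hamming distance suggests a near-duplicate and a large one suggests different content. The cutoff between the two is a tuning choice, and flagged pairs in a records archive would normally still be reviewed before anything is deleted.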

The broader push comes as Los Angeles prepares for an unprecedented increase in official photography tied to the 2028 Games. Venue construction documentation, athlete credentialing images, and public safety surveillance footage will all flow into city and county systems at volumes that will stress any archive without a deduplication layer already in place. Agencies that have not audited their image storage by late 2027 are likely to find themselves racing to fix a problem that has, by then, become significantly more expensive to solve.

About this article

Published by The Daily Los Angeles

This article was produced by The Daily Los Angeles editorial desk and covers news in Los Angeles. See our editorial standards for how we use AI.
