The Daily Los Angeles

LA's Digital Archives Are Riddled With Duplicate Images — and the Numbers Tell a Costly Story

City agencies and cultural institutions are sitting on terabytes of redundant photo files, and the bill for storing them keeps climbing.

By Los Angeles News Desk · Published 4 July 2026, 11:36 am

Photo: Ekaterina Belinskaya on Pexels

Los Angeles city departments and major cultural repositories are collectively managing hundreds of thousands of duplicate digital images across their archives — files that consume expensive server space, slow down public-facing search tools, and complicate compliance with California's public records laws. The problem is not new, but the data emerging from several recent internal audits makes the scale hard to ignore.

The timing matters because planning for the 2028 Olympics has pushed the city into an aggressive digital documentation effort. The Los Angeles 2028 organizing committee, working alongside the City Administrative Office, is generating thousands of new photographic assets every month tied to venue construction progress at sites including SoFi Stadium in Inglewood and the Los Angeles Memorial Coliseum. Unless systematic duplicate-removal protocols are in place before that pipeline accelerates, archivists warn, the redundancy problem will compound sharply over the next 24 months.

What the Numbers Actually Show

Digital asset management specialists who work with municipal clients, speaking generally rather than about any named agency, describe a consistent pattern in large urban archives: between 20 and 35 percent of stored image files in unmanaged repositories are exact or near-exact duplicates, based on industry benchmarks published by the Digital Preservation Coalition. For an institution storing 500,000 images, that translates to between 100,000 and 175,000 redundant files.

Cold cloud storage for image archives typically runs between $0.004 and $0.023 per gigabyte per month depending on the provider tier, according to pricing tables published by major cloud vendors including Amazon Web Services and Google Cloud. A single uncompressed RAW photograph from a modern DSLR camera can exceed 40 megabytes. At those storage rates, 175,000 duplicate images averaging 25 megabytes each represent roughly 4.4 terabytes of redundant data and a recurring monthly storage cost of roughly $18 to $100. That is modest as a single line item, but multiplied across dozens of city departments the figure becomes meaningful at budget review time.
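The arithmetic behind those figures can be checked with a short sketch. The inputs are the ones quoted above; the function names are illustrative, and the use of decimal units (1 GB = 1,000 MB) is an assumption that matches how cloud vendors generally bill.

```python
# Inputs quoted in the article. Decimal units (1 GB = 1,000 MB) are an assumption.
TOTAL_IMAGES = 500_000
DUPLICATE_SHARE_HIGH = 0.35      # upper end of the 20-35 percent range
AVG_IMAGE_MB = 25
PRICE_LOW_PER_GB = 0.004         # dollars per GB per month
PRICE_HIGH_PER_GB = 0.023


def redundant_storage(total_images, duplicate_share, avg_mb):
    """Return (duplicate file count, redundant volume in GB)."""
    duplicates = round(total_images * duplicate_share)
    return duplicates, duplicates * avg_mb / 1000


def monthly_cost(gigabytes, price_per_gb):
    """Return the monthly storage cost in dollars."""
    return gigabytes * price_per_gb


duplicates, redundant_gb = redundant_storage(
    TOTAL_IMAGES, DUPLICATE_SHARE_HIGH, AVG_IMAGE_MB
)
low = monthly_cost(redundant_gb, PRICE_LOW_PER_GB)
high = monthly_cost(redundant_gb, PRICE_HIGH_PER_GB)

print(f"{duplicates:,} duplicates, about {redundant_gb / 1000:.1f} TB")
print(f"about ${low:.0f} to ${high:.0f} per month")
```

For a single 500,000-image repository the result is about 4.4 terabytes and somewhere between roughly $18 and $100 a month, so the budget impact comes from the number of repositories rather than from any one of them.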

The Los Angeles Public Library, which maintains the historic Security Pacific National Bank Photo Collection and the Los Angeles Daily News Negatives archive at its Central Library branch on West Fifth Street in Downtown, digitized more than 2.5 million images over the past decade. Staff there have used a combination of hash-based deduplication software and manual review to manage the collection, but the library has not publicly released figures on what percentage of its digital holdings were identified as duplicates during that process.
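Hash-based deduplication of the kind described here generally works by computing a cryptographic digest of each file's bytes and grouping files whose digests match. The following is a minimal sketch of that general technique, not the library's actual software, which the article does not name. The function names and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root):
    """Group files under root by content hash; keep only groups of two or more."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return {d: paths for d, paths in groups.items() if len(paths) > 1}
```

This approach only catches byte-for-byte copies. A resized, recompressed or re-exported version of the same photograph produces a different digest, which is one reason manual review or a similarity-based method is usually paired with it.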

Local Programs Wrestling With the Problem

The Getty Center in Brentwood and the UCLA Library Special Collections in Westwood are among the larger Los Angeles institutions with formal digital asset management frameworks, though their systems use different metadata schemas, making cross-institutional deduplication essentially impossible without significant coordination. The Los Angeles County Department of Arts and Culture administers a separate grant program — the Arts Ed Collective — that funds digitization projects at smaller neighborhood organizations, many of which have no deduplication workflow at all.

For city agencies specifically, the issue intersects with Mayor Karen Bass's ongoing housing emergency declaration. The Emergency Housing Relief program has generated a substantial documentary record, including site photography from interim housing installations across neighborhoods from Boyle Heights to Sun Valley. Those images flow into multiple departmental systems — the Housing Department, the Emergency Management Department, and the Mayor's Communications Office — with no unified deduplication checkpoint between them.

The practical consequence for taxpayers is not just storage cost. When investigators or journalists submit California Public Records Act requests for photographic documentation, city staff must manually sort through redundant files to produce responsive records — a time cost that shows up in staff hours billed against department budgets.

The clearest near-term step available to city administrators is deploying perceptual hashing tools — software that identifies visually similar images even when file names and metadata differ — across shared network drives before the Olympic documentation volume ramps up. Several open-source options exist, and at least two major vendors offered municipal licensing to California cities as of early 2026. Establishing a single master asset repository tied to the city's existing GIS infrastructure on Figueroa Street would give archivists, planners, and public records officers a common reference point — and stop the duplicate count from doubling again before the torch is lit.
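To illustrate how perceptual hashing differs from exact file hashing, here is a minimal sketch of one simple variant, the average hash. It is not the method of any vendor or tool mentioned above. The image is assumed to be already decoded into a grid of grayscale values from 0 to 255 (a real deployment would use an imaging library for that step), and the function names and the distance threshold of 5 bits are illustrative assumptions.

```python
def shrink(pixels, size=8):
    """Downscale a grayscale grid (list of rows, values 0-255) by block averaging."""
    height, width = len(pixels), len(pixels[0])
    if height < size or width < size:
        raise ValueError("image must be at least size x size pixels")
    small = []
    for by in range(size):
        y0, y1 = by * height // size, (by + 1) * height // size
        row = []
        for bx in range(size):
            x0, x1 = bx * width // size, (bx + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def average_hash(pixels, size=8):
    """Return an integer whose bits mark which cells are brighter than the mean."""
    flat = [value for row in shrink(pixels, size) for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming_distance(a, b):
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a, b, threshold=5):
    """Treat two hashes as the same picture if few bits differ (threshold is assumed)."""
    return hamming_distance(a, b) <= threshold


# Synthetic 16x16 test images: a horizontal gradient, a brightened copy of it,
# and an unrelated vertical gradient.
original = [[x * 16 for x in range(16)] for _ in range(16)]
brightened = [[min(255, v + 10) for v in row] for row in original]
unrelated = [[y * 16 for _ in range(16)] for y in range(16)]

print(hamming_distance(average_hash(original), average_hash(brightened)))
print(hamming_distance(average_hash(original), average_hash(unrelated)))
```

In this toy case the brightened copy differs from the original by 0 bits and is flagged as a near-duplicate, while the unrelated image differs by 32 of 64 bits. An exact file hash would treat all three as entirely different files.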

About this article

Published by The Daily Los Angeles

This article was produced by The Daily Los Angeles editorial desk and covers news in Los Angeles. See our editorial standards for how we use AI.