The Daily Los Angeles

Los Angeles news, every day

LA's Digital Archive Crisis: The Key Decisions Ahead on Duplicate Image Replacement

City agencies, cultural institutions, and Olympic planners are confronting a costly, slow-moving reckoning over how to manage tens of millions of redundant digital images clogging public servers.

By Los Angeles News Desk · Published 4 July 2026, 11:28 am

Photo: Newmark, Harris, 1834-1916 Newmark, Maurice Harris, 1859- Newmark, Marco Ross, 1878 Worden, J. Perry, 1866- / Public domain (Wikimedia Commons)

Los Angeles is sitting on a digital storage problem that nobody wants to own. Across city departments, public libraries, and cultural institutions stretching from Boyle Heights to Brentwood, duplicate images have accumulated for years inside government servers — redundant photos, scanned documents, and archival assets that drain storage budgets, slow retrieval systems, and make public records requests slower and more expensive to fulfill.

The issue has sharpened in 2026 because the clock is ticking. With the 2028 Olympic Games roughly two years out, agencies including the Los Angeles Department of Transportation and the Bureau of Engineering are under pressure to modernize data infrastructure to support real-time event coordination and media operations. Duplicate image files represent both a cost and a liability when response times matter; some estimates from municipal IT circles place redundancy rates above 30 percent in older departmental systems.

Where the Backlog Lives

The Los Angeles Public Library system, which operates 73 branch locations and maintains the digitized California Historical Society Photo Collection through its Central Library on Fifth Street downtown, is one of the most visible pressure points. Librarians and archivists have long flagged that digitization drives conducted between 2015 and 2022 produced large volumes of near-duplicate scans — slightly different exposures or resolutions of the same physical image — without a consistent protocol for selecting a canonical version and retiring the rest.

The Getty Research Institute in Brentwood faces a parallel challenge at a larger scale. Its digital asset management system holds hundreds of thousands of images from international acquisitions, and staff have described, in published institutional reports, the difficulty of implementing deduplication without disrupting the metadata chains that tie images to provenance records, rights clearances, and loan agreements.

At the municipal level, the Mayor's Office of Innovation has not yet issued a unified citywide standard for image deduplication. The office operates under Mayor Karen Bass's housing emergency framework, which has already stretched city IT resources. That gap matters. Without a shared protocol, individual departments make incompatible choices, and the problem compounds.

The Decisions That Will Define the Next 18 Months

Three forks in the road are coming fast. First, the city must decide whether to contract a single enterprise deduplication platform or allow departments to procure their own tools. Enterprise licensing for image management software at municipal scale typically runs between $400,000 and $1.2 million annually depending on storage volume — a range that matters when the city's technology budget is already absorbing costs from Olympic infrastructure upgrades along the Crenshaw/LAX Metro line corridor.

Second, cultural institutions need to settle on a hash-based versus perceptual matching approach. Hash-based deduplication identifies exact binary duplicates — fast and cheap, but it misses near-duplicates. Perceptual matching catches visually similar images even when file properties differ, which is critical for archives but computationally heavier and more expensive. The Los Angeles County Museum of Art, which announced a major digital collections expansion in late 2024, is understood to be evaluating perceptual tools, though no contract has been publicly awarded.
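The difference between the two approaches can be shown in a minimal Python sketch. It is an illustration only, not any institution's actual tooling. It assumes each scan has already been reduced to a small grayscale grid, and it uses a simple "average hash" as a stand-in for perceptual matching. The names exact_digest, average_hash, and hamming are hypothetical helpers invented for this example.

```python
import hashlib

def exact_digest(data: bytes) -> str:
    """Hash-based matching: identical bytes give identical digests."""
    return hashlib.sha256(data).hexdigest()

def average_hash(pixels: list[list[int]]) -> int:
    """Simple perceptual hash: one bit per pixel, set when the pixel
    is brighter than the mean of the whole grid."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

# Hypothetical 4x4 grayscale scan and a slightly brighter re-scan of it.
scan_a = [
    [10, 200, 10, 200],
    [200, 10, 200, 10],
    [10, 200, 10, 200],
    [200, 10, 200, 10],
]
scan_b = [[min(255, p + 12) for p in row] for row in scan_a]

bytes_a = bytes(p for row in scan_a for p in row)
bytes_b = bytes(p for row in scan_b for p in row)

# The exact hashes differ because the bytes differ.
same_exact = exact_digest(bytes_a) == exact_digest(bytes_b)
# The perceptual hashes agree because the light/dark pattern is unchanged.
distance = hamming(average_hash(scan_a), average_hash(scan_b))

print("exact match:", same_exact)
print("perceptual distance:", distance)
```

In this toy case the exact comparison reports no match while the perceptual distance is zero, which is the near-duplicate situation archives face. Production perceptual tools do considerably more work per image, which is where the extra compute cost comes from.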

Third — and most consequential — is the question of human review. Automated deduplication tools can flag candidates for deletion, but archivists and legal staff must still sign off before any image tied to a public record or rights agreement is retired. That labor cost is real. At current rates for certified archivists in Los Angeles County, sustained review work runs roughly $75 to $95 per hour, and a backlog of even 500,000 flagged image pairs represents a significant staffing commitment.
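To put a rough number on that commitment, the arithmetic can be sketched as below. The hourly rates and the 500,000-pair backlog come from the figures above. The review speed of 60 pairs per hour is purely an illustrative assumption, not a reported figure, and the result scales directly with whatever throughput an institution actually measures.

```python
def review_cost(pairs: int, pairs_per_hour: float, hourly_rate: float) -> float:
    """Labor cost of reviewing flagged pairs at a given speed and rate."""
    hours = pairs / pairs_per_hour
    return hours * hourly_rate

FLAGGED_PAIRS = 500_000        # backlog size cited in the article
ASSUMED_PAIRS_PER_HOUR = 60    # hypothetical throughput, for illustration only

low = review_cost(FLAGGED_PAIRS, ASSUMED_PAIRS_PER_HOUR, 75)
high = review_cost(FLAGGED_PAIRS, ASSUMED_PAIRS_PER_HOUR, 95)

print(f"Estimated review cost: ${low:,.0f} to ${high:,.0f}")
```

Under that assumed throughput the backlog works out to about 8,333 review hours, or roughly $625,000 to $792,000 in labor. Halving the review speed would double both figures.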

The practical path forward requires the Mayor's Office of Innovation to publish written guidance before the end of the third quarter of 2026 — ideally a tiered framework that lets small branch libraries use lightweight hash tools while larger institutions with complex provenance requirements adopt perceptual systems under separate funding. Institutions waiting for that guidance should document their current redundancy rates now, building the evidence base for budget requests. Departments that skip that step going into 2027 planning cycles will find themselves arguing for resources without the numbers to back them up.
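For institutions that want a first-pass redundancy figure, a minimal Python sketch along these lines would do. It counts exact byte-for-byte duplicates only, so it understates the near-duplicate problem, and it reads each file fully into memory, which is fine for a sample but not for a whole archive. The function name redundancy_rate and the directory path in the usage line are hypothetical.

```python
import hashlib
from collections import Counter
from pathlib import Path

def redundancy_rate(root: Path) -> float:
    """Share of files under root that are exact copies of another file.

    A group of n identical files counts as n - 1 redundant files.
    """
    digests = Counter()
    for path in root.rglob("*"):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            digests[digest] += 1
    total = sum(digests.values())
    if total == 0:
        return 0.0
    redundant = total - len(digests)
    return redundant / total

if __name__ == "__main__":
    # Hypothetical sample directory; replace with a real path.
    sample = Path("./image_sample")
    if sample.is_dir():
        print(f"Exact-duplicate rate: {redundancy_rate(sample):.1%}")
```

Running this on a representative sample of each collection, and recording the date and sample size alongside the result, gives a defensible baseline for a budget request.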

About this article

Published by The Daily Los Angeles

This article was produced by The Daily Los Angeles editorial desk and covers news in Los Angeles. See our editorial standards for how we use AI.
