The Daily Los Angeles

L.A. Is Digitizing Its Visual Archives — and Fighting a Plague of Duplicate Images Along the Way

As cities worldwide race to modernize public records and cultural collections, Los Angeles is grappling with a messy, expensive problem buried inside its own digital infrastructure.

By Los Angeles News Desk · Published 4 July 2026, 11:47 am

Photo by Darya Sannikova on Pexels

Los Angeles has spent millions converting decades of paper permits, planning documents, and public art records into digital form — but a growing headache has emerged inside those databases: thousands of duplicate images clogging the city's archival systems, slowing retrieval times, and inflating storage costs at a moment when municipal budgets are already stretched thin.

The problem is not cosmetic. City departments managing everything from building inspection photos in Boyle Heights to cultural asset records held by the Los Angeles County Museum of Art's digital collections arm are dealing with redundant image files that were uploaded multiple times across incompatible legacy platforms. Technologists working with the city say the duplication problem compounds every time a new system is layered on top of an old one — which, in Los Angeles, has happened repeatedly since the early 2000s.

The urgency is sharpened by the 2028 Olympics. The city is under pressure to present a coherent, searchable public-facing digital infrastructure by the time athletes and visitors arrive. Duplicate image records inside venue planning databases and public-space documentation systems are a known obstacle to that goal.

What L.A. Is Actually Doing About It

The Bureau of Engineering, which manages infrastructure documentation for projects across the city, has been piloting a deduplication protocol since early 2026. The pilot targets image libraries tied to streetscape and public works projects along the Crenshaw Corridor and in the Olympic Boulevard construction zone near USC. The protocol uses perceptual hashing to flag redundant uploads before they're permanently indexed. The technique fingerprints an image by its visual content, so copies can be matched even when file names, dimensions, or compression settings differ.
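
To illustrate the idea behind perceptual hashing, here is a minimal Python sketch of one simple variant, often called an average hash. The Bureau of Engineering has not said which algorithm its pilot uses, so this is an assumption for illustration only. The function names, the 8x8 fingerprint size, and the distance threshold of 5 are all hypothetical choices, and images are represented as plain grids of grayscale values to keep the example self-contained.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale pixel values


def shrink(pixels: Gray, size: int = 8) -> Gray:
    """Downsample to size x size by averaging each block of pixels."""
    h, w = len(pixels), len(pixels[0])
    out = []
    for r in range(size):
        r0 = r * h // size
        r1 = max((r + 1) * h // size, r0 + 1)
        row = []
        for c in range(size):
            c0 = c * w // size
            c1 = max((c + 1) * w // size, c0 + 1)
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def average_hash(pixels: Gray, size: int = 8) -> int:
    """Fingerprint: one bit per cell, set when the cell is brighter than the mean."""
    flat = [v for row in shrink(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Gray, b: Gray, max_distance: int = 5) -> bool:
    """Hypothetical threshold: small distances suggest the same picture."""
    return hamming(average_hash(a), average_hash(b)) <= max_distance


# Toy images: a left-to-right gradient, a brightened and upscaled copy of it,
# and an unrelated top-to-bottom gradient.
original = [[x * 16 for x in range(16)] for _ in range(16)]
brighter_upscaled = [[original[y // 2][x // 2] + 10 for x in range(32)] for y in range(32)]
unrelated = [[y * 16 for _ in range(16)] for y in range(16)]

print(is_near_duplicate(original, brighter_upscaled))
print(is_near_duplicate(original, unrelated))
```

In this toy case the resized, brightened copy matches the original while the unrelated image does not, even though none of the three would share a file name or a byte-for-byte checksum. Production systems typically use more robust fingerprints and tune the threshold against known duplicates.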

The Los Angeles Public Library's digital collections division, based out of the Central Library on West 5th Street, has separately contracted with a records management vendor to audit its Historic Photographs Collection, which contains more than 750,000 digitized images. Staff found that a significant portion of the archive consists of near-duplicate or exact-duplicate entries, according to a 2025 internal review summary made available to The Daily Los Angeles. The library declined to say how many duplicates were found, citing the ongoing nature of the audit.

The Getty Research Institute in Brentwood, which maintains one of the largest art-documentation digital repositories in the western United States, completed a similar deduplication sweep across its online collections in 2024. The Getty's approach, which relied on AI-assisted image clustering, has since been referenced by at least three other major U.S. cultural institutions as a model worth adapting.

How Other Cities Are Handling the Same Problem

London's Victoria and Albert Museum completed a system-wide image deduplication project across its 1.2 million-object digital catalogue in late 2024, reducing redundant files by roughly 18 percent and cutting cloud storage costs by an estimated £340,000 annually, according to figures the museum published in its 2024-25 annual report. Amsterdam's Rijksmuseum, which opened its Rijksstudio platform to the public in 2013, built deduplication logic directly into its upload architecture from the start — a structural decision Los Angeles's older agencies didn't have the benefit of making.

New York City's Department of Records and Information Services, which manages millions of municipal photographs at its archives in lower Manhattan, is dealing with a version of the same problem. The city allocated $2.1 million in its fiscal year 2026 budget for records modernization, a portion of which covers image deduplication work, according to the New York City Office of Management and Budget's published spending plan.

Los Angeles has not published a comparable standalone budget line for image deduplication work. Costs are distributed across individual departmental IT budgets, making a citywide figure difficult to establish from public records alone.

For residents and researchers, the practical effect shows up in search results. A query for construction photos from the Metro K Line expansion, for example, can return multiple versions of the same image with different file names — a frustration that slows down journalists, historians, and planners alike.
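
The search symptom described above, where one image appears under several file names, is the simplest kind of duplication to detect. A minimal Python sketch, assuming the copies are byte-for-byte identical, groups files by a cryptographic digest of their contents rather than by name. This is an illustration only, not a description of any city system; the function name and the file names in the example are made up.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def group_exact_duplicates(files: Dict[str, bytes]) -> List[List[str]]:
    """Return groups of file names whose contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for name, data in files.items():
        # The digest depends only on the bytes, never on the file name.
        by_digest[hashlib.sha256(data).hexdigest()].append(name)
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


# Hypothetical archive: two names pointing at the same image bytes, plus one distinct file.
archive = {
    "kline_station_001.jpg": b"\xff\xd8 same image bytes",
    "IMG_4471.jpg": b"\xff\xd8 same image bytes",
    "kline_station_002.jpg": b"\xff\xd8 a different image",
}

duplicate_groups = group_exact_duplicates(archive)
print(duplicate_groups)
```

A digest comparison like this misses copies that were resized or recompressed, which is why it is usually paired with a similarity-based method for near-duplicates.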

City technology officials have said publicly that a unified digital asset management strategy is part of the longer-term roadmap of the city's Information Technology Agency (ITA), though no specific implementation date has been announced. For departments like the Bureau of Engineering and the Public Library, the work is happening now, file by file, ahead of whatever unified system eventually arrives.

About this article

Published by The Daily Los Angeles

This article was produced by The Daily Los Angeles editorial desk and covers news in Los Angeles. See our editorial standards for how we use AI.
