The Daily Los Angeles

Los Angeles Is Reckoning With a Digital Archive Problem. Here's How It Compares to London and Tokyo.

City agencies and cultural institutions are scrambling to purge duplicate images from public databases — and LA's approach is both more ambitious and more chaotic than peers abroad.

By Los Angeles News Desk · Published 4 July 2026, 11:51 am

Photo: ubeyonroad on Pexels

Los Angeles city agencies collectively hold tens of millions of digitized images across more than a dozen public-facing databases, and a significant share of those files are duplicates — redundant scans of the same permits, properties, and public artworks uploaded repeatedly as departments merged systems over the past decade. The Bureau of Engineering, which maintains digital records tied to street and infrastructure projects from Highland Park to San Pedro, flagged the problem internally in early 2025. The cleanup is still ongoing.

The issue matters now because the city is pouring money into digital infrastructure ahead of the 2028 Summer Olympics. Redundant image files bloat storage costs, slow public-records retrieval systems, and, according to librarians and archivists who work with city data, create real errors when automated tools mistake a duplicate for a distinct record. The Los Angeles Public Library's digital collections branch, headquartered on West 5th Street in Downtown, has been wrestling with this since it began migrating its California Index photograph collection to a new cloud platform in 2024.

What LA Is Actually Doing About It

The city's Information Technology Agency awarded a contract in late 2024 to modernize records management across several departments, with deduplication of image files listed as a line-item deliverable. The Getty Conservation Institute, based on the west side of Sepulveda Pass in Brentwood, has separately piloted image-hash matching software to identify exact and near-duplicate photographs within its digital archive — a tool it has shared in working sessions with staff from the Los Angeles County Museum of Art on Wilshire Boulevard. Neither institution would confirm specific budget figures for the work.
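For readers curious how hash matching separates exact from near-duplicate images, here is a minimal Python sketch. It is an illustration only, not the Getty's software: the function names, the toy pixel grids, and the distance threshold are all assumptions. Exact copies are found by hashing file bytes with SHA-256; near-duplicates are approximated with a simple average hash over a small grayscale grid.

```python
import hashlib
from collections import defaultdict


def exact_duplicate_groups(files):
    """Group file names whose bytes are identical, keyed by SHA-256 digest."""
    groups = defaultdict(list)
    for name, data in files.items():
        groups[hashlib.sha256(data).hexdigest()].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


def average_hash(pixels):
    """Turn a small grayscale grid into a bit string: 1 where a pixel exceeds the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if value > mean else "0" for value in flat)


def hamming_distance(a, b):
    """Count the positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def is_near_duplicate(pixels_a, pixels_b, max_distance=2):
    """Treat two images as near-duplicates when their hashes differ in few bits.

    max_distance is a hypothetical threshold chosen for this toy example.
    """
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance <= max_distance


# Hypothetical sample data: two byte-identical scans and one distinct scan.
files = {"a.tif": b"scan-1", "b.tif": b"scan-1", "c.tif": b"scan-2"}
duplicate_groups = exact_duplicate_groups(files)

# Toy 2x2 grayscale grids standing in for downscaled photographs.
grid_a = [[10, 200], [220, 30]]
grid_b = [[12, 198], [215, 35]]   # same scene, slightly different scan
grid_c = [[200, 10], [30, 220]]   # different scene
```

In practice an archive would downscale each real photograph to a small grid with an imaging library before hashing, and would tune the threshold against its own collection.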

The Los Angeles County Assessor's office, which manages property photographs for more than 2.5 million parcels, began a formal deduplication audit in January 2026. Staff there are working through a backlog that grew during the January 2025 wildfires, when emergency reclassifications of properties in Altadena and Pacific Palisades generated thousands of rapid, sometimes duplicated, photographic uploads to the county's property information portal.

That wildfire-driven surge illustrates exactly why the problem compounds quickly. Emergency conditions force rapid data entry, deduplication checks get skipped, and the backlog expands. The Assessor's office has not publicly stated a completion date for the audit.

London and Tokyo Have Moved Faster — With Trade-Offs

Compare LA's patchwork response to what Transport for London completed in 2023: a centralized image governance policy requiring all TfL divisions to run uploaded photographs through a single deduplication engine before files enter the archive. The result was a reported reduction of roughly 18 percent in total stored image volume, according to TfL's published digital records report for that fiscal year. The mandate worked partly because TfL operates under a unified authority structure, something LA's fragmented city-county system does not have.

Tokyo's approach is different again. The Tokyo Metropolitan Government standardized image metadata requirements across ward offices starting in 2022, making duplicates easier to detect algorithmically even without a centralized repository. The trade-off: the metadata standardization project took three years and required retraining staff in all 23 special wards. Los Angeles, with its mix of city, county, and independent agency databases, has no equivalent mandate in place.
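Why standardized metadata makes duplicates easier to catch can be shown with a short Python sketch. The field names (`record_id`, `captured_on`, `subject`, `file`) and the sample records are hypothetical and are not drawn from Tokyo's actual schema; the point is only that once every office fills in the same fields, records can be normalized and grouped by a shared key without a central repository.

```python
from collections import defaultdict


def metadata_key(record):
    """Normalize the matching fields so formatting differences do not hide a match."""
    return (
        record["record_id"].strip().upper(),
        record["captured_on"].strip(),
        record["subject"].strip().lower(),
    )


def candidate_duplicates(records):
    """Group files that share the same normalized metadata key."""
    groups = defaultdict(list)
    for record in records:
        groups[metadata_key(record)].append(record["file"])
    return [files for files in groups.values() if len(files) > 1]


# Hypothetical records from two offices describing the same photograph.
records = [
    {"record_id": " p-1042 ", "captured_on": "2022-05-01",
     "subject": "Storefront", "file": "scan_001.jpg"},
    {"record_id": "P-1042", "captured_on": "2022-05-01",
     "subject": "storefront ", "file": "scan_017.jpg"},
    {"record_id": "P-2077", "captured_on": "2022-06-11",
     "subject": "bridge", "file": "scan_020.jpg"},
]
matches = candidate_duplicates(records)
```

A match on metadata alone only flags candidates; a reviewer or an image-level comparison would still need to confirm that the files show the same picture.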

Singapore's National Archives rolled out automated duplicate-image detection in 2021 as part of a broader digitization push, and officials there have cited a storage cost reduction as a result — though the archive operates at a scale far smaller than LA's combined municipal holdings.

For Angelenos, the practical stakes are clearest in permit and property records. Homeowners in neighborhoods like Boyle Heights and Chatsworth who pull building records through the city's Development Services portal sometimes encounter duplicate or conflicting photographs attached to the same permit file — a problem that can slow refinancing or insurance claims. The Bureau of Engineering's ongoing system migration, expected to complete by late 2026, is supposed to resolve the worst of those conflicts. Whether the timeline holds, given the Olympic infrastructure crunch and the lingering wildfire recovery workload, is the question city archivists are watching most closely right now.

About this article

Published by The Daily Los Angeles

This article was produced by The Daily Los Angeles editorial desk and covers news in Los Angeles. See our editorial standards for how we use AI.
