Los Angeles city and county databases used to track homeless individuals, housing applicants, and social services clients contain a significant but unquantified volume of duplicate image records, in which the same person is photographed multiple times under different case IDs. Auditors and data managers say the problem is distorting the headline figures that drive policy decisions and budget allocations across the region.
The issue matters now because Mayor Karen Bass's Inside Safe program, launched under her December 2022 housing emergency declaration, relies on case management software that pulls client photographs as a primary identity anchor. When the same unhoused person is processed at separate intake sites (say, a LAHSA outreach team on Skid Row one week and a PATH Ventures facility in Hollywood the next), the result can be two or more distinct records, each carrying its own image file. Those inflated record counts feed directly into the dashboards policymakers use to justify funding requests to the Los Angeles County Board of Supervisors.
How Bad Is the Duplication Problem?
Precise figures for Los Angeles are not publicly available, but the problem has measurable precedent in comparable urban systems. The U.S. Department of Housing and Urban Development's 2024 Continuum of Care guidance flagged duplicate client entries as a systemic weakness in Homeless Management Information Systems nationwide, noting that some jurisdictions had seen double-counting rates ranging from 8 percent to as high as 22 percent of active records before deduplication protocols were applied. Los Angeles County operates one of the largest such systems in the country, covering roughly 75 participating agencies.
At the Los Angeles Homeless Services Authority, which coordinates data across those agencies, the annual Point-in-Time count has historically been criticized by researchers at USC's Homelessness Policy Research Institute for undercounting. The flip side, overcounting through duplicate entries, receives far less public attention. The 2025 PIT count, released that spring, placed the county's homeless population at approximately 75,312 people. If even a conservative 10 percent duplication rate applied to an active caseload of that size, program officers could be managing more than 7,500 phantom files, each consuming staff time, storage costs, and potential service allocations.
The image duplication layer adds a specific technical cost. Each intake photograph captured by an outreach worker's tablet or by a fixed camera at a site like the Downtown Women's Center on San Pedro Street or the Union Rescue Mission on East 5th Street is typically stored as a JPEG between 1.5 and 4 megabytes. Multiply even 5,000 duplicate image records by an average of 2.5 MB each and the result is roughly 12.5 gigabytes of redundant data sitting on servers that the city's Information Technology Agency spends real money to maintain and back up. At commercial cloud storage rates hovering around $0.023 per gigabyte per month in mid-2026, the raw storage cost is trivial, about 29 cents a month. The staff hours required to audit, flag, and merge those records are not.
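The storage arithmetic above can be checked with a short sketch. It uses only the figures quoted in the preceding paragraph and assumes decimal units (1 GB = 1,000 MB); the function and constant names are illustrative and do not come from any city system.

```python
# Figures quoted in the paragraph above; decimal units (1 GB = 1000 MB) are an assumption.
DUPLICATE_IMAGES = 5_000
AVG_IMAGE_MB = 2.5
PRICE_PER_GB_MONTH = 0.023  # USD, the commercial cloud rate cited in the article


def redundant_storage_gb(n_images, avg_mb):
    """Total redundant storage in gigabytes."""
    return n_images * avg_mb / 1000


def monthly_cost_usd(gigabytes, price_per_gb):
    """Raw monthly storage cost, ignoring backups, egress, and staff time."""
    return gigabytes * price_per_gb


gb = redundant_storage_gb(DUPLICATE_IMAGES, AVG_IMAGE_MB)
cost = monthly_cost_usd(gb, PRICE_PER_GB_MONTH)
print(f"{gb:.1f} GB redundant, about ${cost:.2f} per month, ${cost * 12:.2f} per year")
```

Under those assumptions the raw storage bill stays well under a dollar a month, which is why the cost that matters is staff time rather than disk space.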
What Cleanup Looks Like — and Why It's Stalled
The Los Angeles County Department of Public Social Services piloted a deduplication initiative called Project CLEAR in 2023, targeting its CalWORKs and General Relief caseloads in the Boyle Heights and South Los Angeles service areas. That effort identified and merged roughly 4,200 duplicate client records over a six-month window, freeing an estimated 1,100 hours of caseworker time that had been spent reconciling conflicting records. The same methodology has not yet been applied at scale to the HMIS network that LAHSA manages.
Data managers at several nonprofits operating on the Westside, in Venice and Santa Monica, describe the problem as persistent and growing as intake volume increases ahead of the 2028 Olympics. They asked not to be named because they are not authorized to speak publicly about internal systems. Olympic preparations, they said, are pushing new infrastructure projects and displacement pressures into neighborhoods that already have high concentrations of unhoused residents.
A practical fix exists: facial-recognition-assisted deduplication, combined with manual review, has been used by Allegheny County in Pennsylvania and by New York City's Department of Homeless Services to cut duplicate image rates by more than 60 percent within 12 months of deployment. For Los Angeles, the next decision point is the city's fiscal year 2026-27 ITA budget, which goes before the City Council's Budget and Finance Committee this fall. Advocates say that without a funded deduplication mandate written into that budget, the phantom records will keep accumulating, and the numbers driving the city's most expensive social programs will remain, at least in part, fiction.
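For readers curious how image matching paired with manual review might work in principle, here is a minimal sketch. It is not the method used by Allegheny County, New York City, or any Los Angeles agency. It assumes each intake photo has already been converted by some upstream model into a numeric face embedding, and it only builds a queue of candidate pairs for a human to compare. The record type, case IDs, embedding values, and threshold are all hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations
from math import sqrt


@dataclass(frozen=True)
class IntakeRecord:
    case_id: str
    embedding: tuple  # hypothetical face embedding produced by an upstream model


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


REVIEW_THRESHOLD = 0.90  # illustrative cutoff, not a calibrated value


def review_queue(records, threshold=REVIEW_THRESHOLD):
    """Return (case_id_a, case_id_b, score) pairs for human review, most similar first."""
    queue = []
    for a, b in combinations(records, 2):
        score = cosine_similarity(a.embedding, b.embedding)
        if score >= threshold:
            queue.append((a.case_id, b.case_id, score))
    return sorted(queue, key=lambda item: item[2], reverse=True)


# Made-up records: the first two embeddings are deliberately close together.
records = [
    IntakeRecord("A-001", (0.90, 0.10, 0.40)),
    IntakeRecord("B-207", (0.88, 0.12, 0.41)),
    IntakeRecord("C-315", (0.10, 0.95, 0.20)),
]
queue = review_queue(records)
for a, b, score in queue:
    print(f"review {a} vs {b}: similarity {score:.3f}")
```

Nothing is merged automatically in this sketch: a caseworker still decides whether a flagged pair is the same person. Comparing every pair also scales quadratically with the number of records, so a system covering tens of thousands of files would likely need some form of indexing or pre-filtering first.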