How synthetic identity fraud evades detection

What makes this type of fraud a challenge to detect

Matthew P. Miller

Principal, Advisory, Cyber Security Services, KPMG US


Ryan Budnik

Manager Advisory, Cyber Security Services, KPMG US

+1 512-320-5200

Sophia Chen

Associate Advisory, Cyber Security Services, KPMG US

+1 949-885-5511

In the first installment of this series, we detailed what synthetic identity fraud is and why we should care about it. As a quick summary, synthetic identity fraud occurs when cybercriminals create identities using a combination of real and fake information. These identities are commonly used to obtain loans with no intention of repaying the lenders.

Because it is so difficult to detect, synthetic identity fraud has quickly become a leading crime, costing banks more than $6 billion in total, or $15,000 per occurrence, in damages.1 In this article, we’ll discuss what makes this type of fraud a challenge to detect.

When you apply for credit, a few key pieces of information are required. These include your Social Security number (SSN), name, date of birth, and address, among others. With this information, lenders can pull a credit history to assess your risk and ultimately determine whether they’ll grant you a line of credit. In some instances, legitimate applicants have a limited credit history or are rebuilding their credit, and fraudulent applications frequently resemble these situations.

But what about verifying that the SSN provided matches the name, address, date of birth, and other personally identifiable information (PII) provided? We would think that this process is straightforward. However, there isn’t an efficient method to electronically validate that the PII provided is associated with that particular SSN. The current method is too time-consuming for a process in which decisions on applications are often returned within minutes of submission.

Furthermore, in 2011, the Social Security Administration began assigning SSNs at random to “protect the integrity of SSNs and to extend the pool of nine-digit SSNs available nationwide.”2 Previously, SSNs were issued following a clearly defined numbering scheme in which the nine-digit number was composed of three parts: the Area Number, the Group Number, and the Serial Number.3 Under that scheme, because the Area Number was assigned by geographical region, we could check whether, for example, the address on an application was plausible given the SSN provided. This type of analysis is not possible for SSNs issued under the new randomization strategy.
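To make the old numbering scheme concrete, here is a minimal Python sketch of the kind of plausibility check it allowed. The three-part split (three digits, two digits, four digits) follows the structure described above. Everything else is an assumption for illustration: the function names are ours, and the `AREA_TO_REGION` table uses placeholder ranges and region labels, not real Social Security Administration allocations.

```python
from typing import NamedTuple, Optional


class SsnParts(NamedTuple):
    area: int    # first three digits (Area Number)
    group: int   # middle two digits (Group Number)
    serial: int  # last four digits (Serial Number)


def split_ssn(ssn: str) -> SsnParts:
    """Split a nine-digit SSN, with or without dashes, into its three parts."""
    digits = ssn.replace("-", "")
    if len(digits) != 9 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected exactly nine digits")
    return SsnParts(int(digits[:3]), int(digits[3:5]), int(digits[5:]))


# HYPOTHETICAL lookup table: placeholder ranges and labels only.
# A real check would need the historical area-number allocations.
AREA_TO_REGION = [
    (range(1, 100), "Region A"),
    (range(100, 200), "Region B"),
]


def region_for_area(area: int) -> Optional[str]:
    """Return the region label for an Area Number, or None if unknown."""
    for area_range, region in AREA_TO_REGION:
        if area in area_range:
            return region
    return None


def address_is_plausible(ssn: str, applicant_region: str) -> Optional[bool]:
    """Compare the region implied by the Area Number with the applicant's region.

    Returns None when the Area Number is not in the lookup table, which is
    always the situation for randomly assigned SSNs.
    """
    region = region_for_area(split_ssn(ssn).area)
    if region is None:
        return None
    return region == applicant_region
```

A mismatch here would only ever be a signal worth a closer look, not proof of fraud. With randomized assignment, the Area Number no longer carries geographic meaning, so a lookup of this kind has nothing to work with for newly issued numbers.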

Instead of assessing applications holistically, institutions may rely on validating each piece of information individually. This means that if the name, address, date of birth, and SSN are each valid on their own, banks are none the wiser that the combination does not belong to a real person.
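The gap left by field-by-field validation can be sketched in a few lines of Python. This is an illustration under our own assumptions, not a description of any institution's actual process: the validator names, the rules they apply, and the sample application are all hypothetical.

```python
import re
from datetime import date


def valid_name(name: str) -> bool:
    # Hypothetical rule: any non-empty name is accepted.
    return bool(name.strip())


def valid_ssn_format(ssn: str) -> bool:
    # Checks the shape of the number only, not who it belongs to.
    return re.fullmatch(r"\d{3}-\d{2}-\d{4}", ssn) is not None


def valid_date_of_birth(dob: date, today: date) -> bool:
    # Hypothetical rule: the date must fall in a sensible range.
    return date(1900, 1, 1) <= dob <= today


def valid_address(address: str) -> bool:
    # Hypothetical rule: any non-empty address is accepted.
    return bool(address.strip())


def passes_field_checks(application: dict, today: date) -> bool:
    """Validate each field on its own; no check ties the fields together."""
    return all([
        valid_name(application["name"]),
        valid_ssn_format(application["ssn"]),
        valid_date_of_birth(application["date_of_birth"], today),
        valid_address(application["address"]),
    ])


# Hypothetical synthetic identity: a well-formed SSN paired with
# invented personal details.
synthetic_application = {
    "name": "Jane Example",
    "ssn": "123-45-6789",
    "date_of_birth": date(1990, 5, 17),
    "address": "1 Sample Street, Anytown",
}

approved_for_review = passes_field_checks(synthetic_application, date(2021, 1, 1))
```

Every validator passes, so `approved_for_review` is `True`, even though nothing confirms that the name, date of birth, and address belong to the holder of that SSN. A holistic assessment would need at least one check that looks at the fields together.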

Equipped with this background, we may now ask ourselves: What can we do to detect synthetic identity fraud? The next installment of this series aims to answer this question, exploring how artificial intelligence and machine learning can be applied to combat this type of fraud.


  1. Synthetic Identity Fraud in the U.S. Payment System: A Review of Causes and Contributing Factors, The Federal Reserve, July 2019.
  2. Ibid.
  3. Social Security Numbers, Social Security Administration, 2021.