Why don't we just block the fraudster's IP address and be done with it?

06.7.2023

One of the common practices I have observed in fraud mitigation is leveraging IP addresses, whether for blocking transactions linked to high-risk countries or for mitigating an attack from a particular IP address. Unfortunately, not every use case I saw took the limitations of IP addresses into account. To use IP addresses to the fullest, it helps to have a certain level of technical networking knowledge.

IP address

There are two versions of the Internet Protocol in use: IPv4 and IPv6 (there was also IPv5, which remained experimental). IPv4 has been around since the early 1980s; it is the most commonly used version and the communication foundation of today's internet. IPv4 is connectionless, which means that when two devices want to communicate via IP, they don't need to establish a connection or session first (unlike with TCP). Each of the two devices has to be assigned a unique IP address, though. When we refer to an IP address in our discussion, we usually mean an IPv4 address.

The IP address in IPv4 is a 32-bit number, commonly written in dotted notation as four numeric values (octets), each ranging from 0 to 255. All addressable IP addresses therefore technically range from 0.0.0.0 to 255.255.255.255. The IP address [217.165.217.165] can be written in multiple ways: with the decimal octets replaced by hexadecimal ones (d9.a5.d9.a5) or as a single integer without dots (3651525029).
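The three notations can be checked with Python's standard `ipaddress` module. This is only a small sketch; the variable names are mine and not part of any fraud tooling.

```python
import ipaddress

addr = ipaddress.IPv4Address("217.165.217.165")

# Hexadecimal octets: each of the 4 bytes printed as two hex digits.
hex_form = ".".join(f"{octet:02x}" for octet in addr.packed)

# Integer form: the same 32 bits read as one unsigned number.
int_form = int(addr)

# The integer converts back to the same dotted address.
round_trip = ipaddress.IPv4Address(int_form)

print(hex_form)    # d9.a5.d9.a5
print(int_form)    # 3651525029
print(round_trip)  # 217.165.217.165
```

This matters in practice because the same address may arrive in logs in different notations, so it is worth normalizing before comparing.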

Mathematically, IPv4 can address ≈4.3 billion devices in the network. Practically, certain ranges have been allocated for specific purposes. Below are the most common ones you might have encountered. The first is the loopback range, from which we commonly use the address 127.0.0.1, which represents localhost, i.e. the machine we are actually sitting at. The three private ranges are also commonly known (especially the last one, 192.168.xxx.xxx), as we often use them when we configure our home networks. There are a few more, like multicast addresses or reserved addresses, which reduce the number of addressable IPs to less than 4.3B.

Loopback range:

127.0.0.0 - 127.255.255.255

Private ranges:

10.0.0.0 – 10.255.255.255
172.16.0.0 – 172.31.255.255
192.168.0.0 – 192.168.255.255
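As a rough sketch, the ranges above can be written in CIDR notation and checked with the standard `ipaddress` module. The labels and the `classify` helper are my own naming, and anything outside these four ranges is simply reported as "other", since further reserved ranges exist.

```python
import ipaddress

# The ranges listed above, written in CIDR notation.
SPECIAL_RANGES = {
    "loopback": ipaddress.ip_network("127.0.0.0/8"),
    "private-10": ipaddress.ip_network("10.0.0.0/8"),
    "private-172": ipaddress.ip_network("172.16.0.0/12"),
    "private-192": ipaddress.ip_network("192.168.0.0/16"),
}

def classify(ip: str) -> str:
    """Return the label of the listed range containing ip, or 'other'."""
    address = ipaddress.ip_address(ip)
    for name, network in SPECIAL_RANGES.items():
        if address in network:
            return name
    return "other"

# How many of the ~4.3B addresses these four ranges take up.
reserved_total = sum(net.num_addresses for net in SPECIAL_RANGES.values())

print(classify("127.0.0.1"))      # loopback
print(classify("192.168.0.100"))  # private-192
print(classify("78.98.3.91"))     # other
print(reserved_total)             # 34668544
```

If an address from one of these ranges shows up as the customer's IP in your transaction data, it usually means you are logging an internal hop rather than the real public address.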

The number 4.3B might feel like quite a lot. Ask Google how many computers there are, and it will come up with ≈2B in 2019, including laptops and servers. But don't forget that one server can have multiple network cards and connections, and the same applies to a common laptop, which usually has one Ethernet and one Wi-Fi Network Interface Controller (NIC).

The number of mobile devices is much bigger: there were ≈15B of them in 2021. Even if we agree that not all of them are connected to the internet, it's more than clear that IPv4 with its 4.3B addresses is simply not enough. And on top of these already high numbers, we have to consider the Internet of Things (web cameras, Alexas, smart appliances, etc.) as well as Industrial IoT.

So what magic is it that lets us still use IPv4 with so many devices connected to the internet? Multiple technologies were developed and deployed, but two of them relate to our topic (we are still trying to stay in the fraud prevention domain):

Network Address Translation (NAT)
IPv6

If you are familiar with NAT and IPv6 or want to skip the details, you can skip the two sections starting with Extra: and continue reading from "IP address - to block or not?" below!

Extra: Network Address Translation (NAT)

If 4.3B devices is the limit for a single network of interconnected devices in IPv4, what if we could split the devices into separately addressable networks? Imagine that my home network uses the above-mentioned range 192.168.xxx.xxx and your home network uses precisely the same range, while the devices in each home have unique addresses only within their own network. This is, in simple terms, what NAT allows us to do.

Suppose your computer or phone on your home network needs to communicate with another computer connected to the internet. In that case, your router applies an address translation: it replaces your device's local source IP address with its own assigned IP and remembers which device made the request (using ports), so that once the response packets arrive at the router, it can forward them specifically to your laptop or phone.

The diagram above depicts how a laptop with local IP 192.168.0.100 accesses the twitter.com server with IP address 104.244.42.65. The packets travel from the laptop to the router of the private home network, where the first translation happens. The second translation then happens between the internal network of the Internet Service Provider (ISP) and the internet. We can also see that the ISP uses the IP address 78.98.3.91, which is globally unique and is one of the addresses from our original 4.3B bucket of IPv4 addresses.
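To make the port bookkeeping concrete, here is a toy sketch of port-based translation. It is a simplification: it collapses the two translation steps described above into a single router. The `NatRouter` class, the starting port 40000, and the second device at 192.168.0.101 are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class NatRouter:
    """Toy port-based NAT: many private hosts share one public IP."""
    public_ip: str
    next_port: int = 40000  # hypothetical start of the public port pool
    # public port -> (private ip, private port)
    table: dict = field(default_factory=dict)

    def outbound(self, private_ip: str, private_port: int) -> tuple:
        """Rewrite the source of an outgoing packet and remember the mapping."""
        public_port = self.next_port
        self.next_port += 1
        self.table[public_port] = (private_ip, private_port)
        return (self.public_ip, public_port)

    def inbound(self, public_port: int) -> tuple:
        """Find which private host a response on public_port belongs to."""
        return self.table[public_port]

router = NatRouter(public_ip="78.98.3.91")
laptop_seen_as = router.outbound("192.168.0.100", 51000)
phone_seen_as = router.outbound("192.168.0.101", 51000)  # hypothetical 2nd device

print(laptop_seen_as)         # ('78.98.3.91', 40000)
print(phone_seen_as)          # ('78.98.3.91', 40001)
print(router.inbound(40001))  # ('192.168.0.101', 51000)
```

Note that the remote server only ever sees 78.98.3.91 for both devices; only the router can tell them apart.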

Thanks to NAT, many more devices can be connected to the internet than there are IPv4 addresses available. NAT was first specified in 1994 (RFC 1631), and a few years later, in 1998, the IPv6 specification (RFC 2460) was published.

Extra: IPv6

IPv6 was supposed to resolve the issues linked to the insufficient address space, while NAT and CIDR (Classless Inter-Domain Routing) were supposed to buy time for the transition from IPv4 to IPv6.

IPv6 has 2^128 addresses, which in simplified terms gives (as per Google) about 5*10^28 addresses to every one of approx. 6.5B humans on Earth, or 2^52 addresses for every observable star in the known universe :) so it is really a lot. As with IPv4, there are special ranges dedicated to various special use cases. Also, there is no longer any need for NAT to save addresses, as every device can have its own unique IP.
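A quick check of the arithmetic above, using Python's arbitrary-precision integers. The 6.5B population figure is the one quoted in the text, not a current estimate.

```python
IPV4_ADDRESSES = 2 ** 32
IPV6_ADDRESSES = 2 ** 128
HUMANS = 6_500_000_000  # population figure used in the text

# Whole IPv6 addresses available per person.
per_person = IPV6_ADDRESSES // HUMANS

print(f"{IPV4_ADDRESSES:,}")  # 4,294,967,296
print(f"{per_person:.1e}")    # 5.2e+28
```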

Despite the many benefits of IPv6, its adoption is much slower than anticipated. IPv6 hosts cannot directly communicate with IPv4 hosts and must communicate via special gateway services. Also, switching from IPv4 to IPv6 on an organizational level requires a lot of technical expertise and effort.

IP address - to block or not?

Now, steering back a bit to fraud prevention: to break down what blocking an IP address means, I will use a diagram. When a customer connects to the internet from their laptop [IP: 192.168.0.100], Google or Twitter will not receive the actual IP assigned to the laptop within the home network but only the global IP assigned to the Internet Service Provider [red highlighted IP 78.98.3.91]. The same will be true for a fraudster [IP: 192.168.1.10] who uses the same ISP: Google or Twitter will see them with the same IP [red highlighted IP 78.98.3.91]. Of course, if criminal activity happens, the police can ask the ISP to identify the actual customer who connected to Google or Twitter at a given time, but this is handled by the police as part of a criminal investigation.

So can we, or should we, use IP blocking? I would break it down into two different use cases.

If your customers are almost entirely located in your country of operations, then you can block particular IPs assigned outside of your country without much impact. The same applies if you serve several countries but not others.
If your customers are or can be connecting from anywhere, be aware that you are not blocking one particular customer or device. You are most likely blocking an endpoint shared by many users, so you may be blocking your good customers as well.

Be extra cautious when blocking IPs assigned within your country of operations. In this case, I would advise using an IP blocking rule only as a temporary measure, when you need to stop an ongoing fraud attack immediately and have nothing more granular than the IP from which the attack is coming. Make sure to replace this rule ASAP with a focused one that targets a more specific pattern than just an IP address (e.g., the customer's behavior, transaction amounts, or velocity).

IP address - can I use it for geo-location?

So far, we have spoken a lot about the technical aspect of IP addresses and how there are not enough addresses for all. But one crucial aspect of IP in fraud detection is linking the IP address to a particular geography.

I was lucky enough to find a secret mapping that reveals your geo-location from an IPv4 or IPv6 address - see below.

OK, ok, jokes aside. The above is complete nonsense, but it is a fact that an IPv4 address can be translated into a geo-location, as all IPv4 addresses are allocated by the Internet Assigned Numbers Authority (IANA), which is a department of the Internet Corporation for Assigned Names and Numbers (ICANN).

Allocations of the IPv4 address ranges are historically highly imbalanced in favor of the US [1]. Still, IANA assigns address ranges (lately based on demand) from the available ones to Regional Internet Registries (RIRs). These are further cascaded to National Internet Registries (NIRs), which in turn allocate them to Local Internet Registries (LIRs) or Internet Service Providers (ISPs), who can use them or rent them out to their customers, whether companies or even individuals.

For example, the UAE has been allocated 2,838,400 IPv4 addresses, the Kingdom of Saudi Arabia 5,504,256, and my home country, Slovakia, 2,555,392 [2].

From the above, you can already see that each allocated IP is assigned to a specific country via an NIR and then further down to an LIR or ISP, which will have multiple endpoints throughout the given country with allocated global IP addresses. Therefore, the geo-location for a given IP address can be as granular as the city or ZIP code level.
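Conceptually, IP geo-location is a lookup of the address in a table of allocated ranges, where the most specific matching range gives the most granular location. The sketch below illustrates this with a made-up table. The prefixes are reserved documentation ranges (RFC 5737) and the place names are placeholders, so none of this is real allocation data. Real geo-IP databases are maintained by specialized providers.

```python
import ipaddress

# Hypothetical mapping for illustration only: documentation prefixes
# (RFC 5737) paired with placeholder locations.
GEO_TABLE = [
    (ipaddress.ip_network("203.0.113.0/24"), "Country A"),
    (ipaddress.ip_network("203.0.113.128/25"), "Country A, City X"),
    (ipaddress.ip_network("198.51.100.0/24"), "Country B"),
]

def locate(ip: str) -> str:
    """Return the location of the most specific range containing ip."""
    address = ipaddress.ip_address(ip)
    matches = [(net, place) for net, place in GEO_TABLE if address in net]
    if not matches:
        return "unknown"
    # The longest prefix is the most specific allocation.
    return max(matches, key=lambda match: match[0].prefixlen)[1]

print(locate("203.0.113.200"))  # Country A, City X
print(locate("203.0.113.5"))    # Country A
print(locate("192.0.2.1"))      # unknown
```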

Having said that, we can't forget to highlight that there are ways to alter the IP address of the device we are connecting from: via VPNs or proxy servers. VPNs and proxy servers can be, and are, used by fraudsters to mislead or confuse the counterparty. But VPNs can be used for legitimate purposes as well, for example, to bypass geo-based filters where certain web services are only provided to specific geographies. E.g., the US content of Netflix is much broader than in other regions, so you might want to use a VPN to gain access to US content.

In our case - fraud prevention - fraudsters, especially the tech-savvy ones, will undoubtedly use a VPN not only to hide their actual ISP's IP but also to complicate the back-tracking of their digital footprint by connecting via countries with favorable legislation or countries with limited capacity to investigate cybercrime.

To cite an example from Sift report: "The attack demonstrated how cybercriminals have remodeled typical ATO techniques to make a greater impact: using bots, proxy servers, and millions of compromised credentials, they were able to cycle through millions of usernames and passwords, while simultaneously and rapidly switching IP addresses in order to hide the origin of the attacks—and avoid getting blocked by typical rules-based fraud prevention systems. In fact, the largest group—or cluster—of blocked IP addresses grew by 50x between Q1-Q2 2021." [4]

Europol, in its IOCTA 2023 report, states: "Virtual Private Networks (VPNs) are the most common services cybercriminals use to shield communication and Internet browsing by masking identities, locations, and the infrastructure of their operations." [5]

Geo-location-based rules derived from IP might also fire many false positives in places where it is common among the general public to use VPNs or proxy servers. Some countries with the highest usage of VPNs are listed above, and GCC countries are among them [3].

So use geo-location based on IP addresses with caution, and make sure it fits your particular use case, considering that the IP can be, and often is, altered.

The post below highlights the most important aspects of IP geo-location: