A tale of two IPs

So, this one time, I started losing packets at random. No rhyme or reason; packets just wouldn't go. They would stop for a while, start up again, and the cycle kept repeating. Something like this:

$ ping 8.8.8.8
...
64 bytes from 8.8.8.8: icmp_seq=10 ttl=45 time=54.6 ms
64 bytes from 8.8.8.8: icmp_seq=11 ttl=45 time=53.4 ms
64 bytes from 8.8.8.8: icmp_seq=12 ttl=45 time=53.7 ms
64 bytes from 8.8.8.8: icmp_seq=13 ttl=45 time=54.0 ms
64 bytes from 8.8.8.8: icmp_seq=14 ttl=45 time=53.9 ms
64 bytes from 8.8.8.8: icmp_seq=38 ttl=45 time=68.9 ms
64 bytes from 8.8.8.8: icmp_seq=39 ttl=45 time=58.1 ms
64 bytes from 8.8.8.8: icmp_seq=40 ttl=45 time=54.2 ms
64 bytes from 8.8.8.8: icmp_seq=41 ttl=45 time=61.0 ms
64 bytes from 8.8.8.8: icmp_seq=42 ttl=45 time=57.6 ms
64 bytes from 8.8.8.8: icmp_seq=43 ttl=45 time=66.0 ms
64 bytes from 8.8.8.8: icmp_seq=44 ttl=45 time=62.8 ms
64 bytes from 8.8.8.8: icmp_seq=45 ttl=45 time=61.1 ms
...

See how no ping responses came back between icmp_seq 14 and 38? That was happening every few minutes or so. I decided to see what was up.
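Spotting a jump like that by eye gets tedious on a long run. Here is a minimal Python sketch that scans saved ping output for missing sequence numbers. The helper name find_gaps and the regex are my own, and it assumes the Linux ping reply format shown above and ignores sequence-number wraparound.

```python
import re

# Matches the sequence number in a ping reply line (format assumed from the output above).
SEQ_RE = re.compile(r"icmp_seq=(\d+)")

def find_gaps(ping_output):
    """Return (last_seen, next_seen) pairs where at least one reply is missing."""
    seqs = [int(m.group(1)) for m in SEQ_RE.finditer(ping_output)]
    return [(a, b) for a, b in zip(seqs, seqs[1:]) if b - a > 1]

sample = """\
64 bytes from 8.8.8.8: icmp_seq=13 ttl=45 time=54.0 ms
64 bytes from 8.8.8.8: icmp_seq=14 ttl=45 time=53.9 ms
64 bytes from 8.8.8.8: icmp_seq=38 ttl=45 time=68.9 ms
64 bytes from 8.8.8.8: icmp_seq=39 ttl=45 time=58.1 ms
"""

for last_seen, next_seen in find_gaps(sample):
    lost = next_seen - last_seen - 1
    print(f"lost {lost} replies between icmp_seq {last_seen} and {next_seen}")
# prints: lost 23 replies between icmp_seq 14 and 38
```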

The previous command gave me an important clue. I now knew that the failing pings were most likely due to external causes. If my own system had been blocking outgoing pings (a local firewall rule, for example), I would have seen the following error, or something similar:

$ ping 8.8.8.8
...
64 bytes from 8.8.8.8: icmp_seq=2 ttl=45 time=52.8 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=45 time=53.1 ms
64 bytes from 8.8.8.8: icmp_seq=4 ttl=45 time=52.8 ms
64 bytes from 8.8.8.8: icmp_seq=5 ttl=45 time=53.2 ms
64 bytes from 8.8.8.8: icmp_seq=6 ttl=45 time=52.9 ms
ping: sendmsg: Operation not permitted
ping: sendmsg: Operation not permitted
ping: sendmsg: Operation not permitted
...
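The same distinction can be written down as a rough heuristic. This Python sketch only looks for the "ping: sendmsg:" prefix seen in the output above; the function name looks_local is hypothetical, and a clean result does not prove the local machine is innocent.

```python
def looks_local(ping_output):
    """True if ping reported a local send error (heuristic based on the prefix above)."""
    return any(line.startswith("ping: sendmsg:") for line in ping_output.splitlines())

blocked = """\
64 bytes from 8.8.8.8: icmp_seq=6 ttl=45 time=52.9 ms
ping: sendmsg: Operation not permitted
"""
dropped_elsewhere = """\
64 bytes from 8.8.8.8: icmp_seq=14 ttl=45 time=53.9 ms
64 bytes from 8.8.8.8: icmp_seq=38 ttl=45 time=68.9 ms
"""

print(looks_local(blocked))            # True
print(looks_local(dropped_elsewhere))  # False
```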

I then looked at the usual suspects:

$ cat /etc/resolv.conf
# Generated by NetworkManager
nameserver 8.8.8.8
nameserver 8.8.4.4

$ ip addr show dev eno1
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether c8:60:00:b4:78:ec brd ff:ff:ff:ff:ff:ff
    inet 10.1.36.72/24 brd 10.1.36.255 scope global dynamic eno1
       valid_lft 309594sec preferred_lft 309594sec
    inet6 fe80::3c1a:2fed:e9f:11b3/64 scope link
       valid_lft forever preferred_lft forever

$ ip route
default via 10.1.36.1 dev eno1 proto static metric 100
10.1.36.0/24 dev eno1 proto kernel scope link src 10.1.36.72 metric 100

All of these looked pretty normal.

My breakthrough came when, having almost given up, I started a packet capture and looked through it.

$ tcpdump -i eno1
...
23:58:25.452776 IP 10.4.20.103.webcache > dark-star.36650: Flags [F.], seq 2553161153, ack 2283786306
23:58:25.655872 IP 10.4.20.103.webcache > dark-star.36650: Flags [F.], seq 0, ack 1, win 194
23:58:25.859914 IP 10.4.20.103.webcache > dark-star.36650: Flags [F.], seq 0, ack 1, win 194
23:58:25.897154 ARP, Request who-has dark-star (c8:60:00:b4:78:ec (oui Unknown)) tell 0.0.0.0, length 46
23:58:25.897172 ARP, Reply dark-star is-at c8:60:00:b4:78:ec (oui Unknown), length 28
23:58:25.897363 IP dark-star.54151 > 10.4.20.222.domain: 4694+ PTR? 72.36.1.10.in-addr.arpa. (41)
...
23:58:46.034295 IP 10.1.36.10.45571 > 239.255.255.250.ssdp: UDP, length 172
23:58:46.148197 IP 10.1.36.99.42738 > 239.255.255.250.ssdp: UDP, length 172
23:58:46.252150 ARP, Request who-has dark-star (Broadcast) tell 0.0.0.0, length 28
23:58:46.252335 ARP, Reply dark-star is-at 74:86:7a:49:68:71 (oui Unknown), length 46
...

FOUND IT. ARP requests for dark-star yielded replies with two different MAC addresses: c8:60:00:b4:78:ec (my own NIC) and 74:86:7a:49:68:71. Since dark-star was my system, that meant some other system had somehow ended up with the same IP as me. I then found arping, a tool that can be used to detect exactly this issue. Unsurprisingly, the test came back positive.

$ arping -I eno1 -D 10.1.36.72
ARPING 10.1.36.72 from 0.0.0.0 eno1
Unicast reply from 10.1.36.72 [74:86:7A:49:68:71]  0.710ms
Sent 1 probes (1 broadcast(s))
Received 1 response(s)

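In duplicate address detection mode (-D), any reply means some other machine already holds the address, and here 74:86:7A:49:68:71 answered for mine. So in the end, the two IPs really were the same. See the arping man page for more details.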
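The check I did by eye on the tcpdump output, spotting two different MACs answering for the same host, can also be sketched in Python. This is a rough illustration, not a replacement for arping: the name find_conflicts and the regex are my own, and it assumes the "ARP, Reply <host> is-at <mac>" line format from the capture above.

```python
import re
from collections import defaultdict

# Matches tcpdump lines such as:
#   ARP, Reply dark-star is-at c8:60:00:b4:78:ec (oui Unknown), length 28
ARP_REPLY_RE = re.compile(
    r"ARP, Reply (\S+) is-at ((?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2})"
)

def find_conflicts(tcpdump_output):
    """Return {host: set of MACs} for every host that more than one MAC answered for."""
    macs_by_host = defaultdict(set)
    for host, mac in ARP_REPLY_RE.findall(tcpdump_output):
        macs_by_host[host].add(mac.lower())
    return {host: macs for host, macs in macs_by_host.items() if len(macs) > 1}

sample = """\
23:58:25.897172 ARP, Reply dark-star is-at c8:60:00:b4:78:ec (oui Unknown), length 28
23:58:46.252335 ARP, Reply dark-star is-at 74:86:7a:49:68:71 (oui Unknown), length 46
"""

for host, macs in find_conflicts(sample).items():
    print(f"{host} answered from {len(macs)} MACs: {', '.join(sorted(macs))}")
# prints: dark-star answered from 2 MACs: 74:86:7a:49:68:71, c8:60:00:b4:78:ec
```

Feeding it a saved capture (for example, tcpdump output redirected to a file) should flag any address that two machines are fighting over, as long as both replies show up in the capture.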