I work on the campus of a major U.S. university. Our ‘open door’ network demands that we allow unknown entities on our campus, leaving our security posture weak at best. One way we combat this is by being able to produce forensic data about who is on our network and over what period of time. The method we use yields a mountain of valuable data, not only for the forensic purposes it was originally designed for, but also for tracking usage and growth. I am pleased to present this first in a multi-part series of articles on how we accomplish this, as well as the benefits.
The first goal in any network is to know who is on it. Your network is probably diverse, with many departments purchasing their own equipment. Even if you have policies against such things, you may still suffer because someone at some time in the past purchased a non-standard piece of equipment. This unfortunately always results in a heterogeneous network design. The first step, then, is to control your CORE: the primary switches or routers from which all networks in all departments receive their network services. This set of network nodes is the primary focus of this paper and will henceforth be known as the core. Your core, at least, is likely homogeneous. If that is the case, this method will work.
Begin by determining the brand and model of your core devices and which SNMP query each needs in order to return its ARP cache. For the sake of brevity, I cannot list all vendors and their SNMP OIDs. I’ll focus on one single vendor and type and show you how the logic is developed. From there you should be able to make any modifications desired.
Linux has a great package available that includes the utilities you’ll need.
For Red Hat users this is the package titled net-snmp-utils. For the purpose of this article I also assume you have a network connection with all appropriate access to these core switches, root access to the Linux host, and a MySQL server up and running with the database schema provided below. MySQL server 4.1 or higher is suggested because the database updates rely on one specific statement: INSERT … ON DUPLICATE KEY UPDATE.
The following statement will create the necessary table. I’ll assume you are using the database named arpcache for the rest of this paper.
CREATE TABLE `arp` (
`id` bigint(20) NOT NULL auto_increment,
`sourceip` varchar(12) default NULL,
`ipaddress` varchar(12) default NULL,
`macaddress` varchar(12) default NULL,
`arp` varchar(24) default NULL,
`firstseen` datetime default NULL,
`lastseen` datetime default NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `arp` (`arp`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
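The column widths follow from how the script later stores addresses: each IP address is written as twelve zero-padded digits, each MAC address as twelve lowercase hex digits, and the unique arp column is simply the two joined together. A minimal sketch of that composition, assuming a hypothetical helper named arp_key that is not part of the script in this article:

```shell
#!/bin/bash
# arp_key DOTTED_IP MAC
# Hypothetical helper, for illustration only: builds the 24-character value
# stored in the unique `arp` column (12 digits of IP + 12 hex digits of MAC).
arp_key() {
    # Zero-pad every octet to three digits, e.g. 192.168.1.9 -> 192168001009
    ip=$(echo "$1" | awk -F. '{printf "%03d%03d%03d%03d", $1, $2, $3, $4}')
    # Drop spaces and colons from the MAC, then lowercase the hex digits
    mac=$(echo "$2" | tr -d ' :' | tr 'A-F' 'a-f')
    echo "${ip}${mac}"
}
```

Calling arp_key 192.168.1.9 "00 0B DB 87 84 E8" gives a 24-character string, which is why the arp column is declared as varchar(24). Because that column carries the UNIQUE key, a repeated sighting of the same pair triggers the ON DUPLICATE KEY UPDATE branch instead of adding a row.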
Create a directory from which we can work and move to that directory.
mkdir -p /var/data/arpcache
cd /var/data/arpcache
Create a file with the IP addresses of all the core switches that we will poll. If you have multiple vendors that will not respond to the same SNMP commands, create multiple files. Name them appropriately.
cat > core.dat
10.0.0.1
10.0.0.2
10.0.0.3
Press Ctrl-D to end the input. Anytime you change your core design you will need to edit this file. To disable an entry, comment it out by starting the line with ‘#’. These lines will be skipped.
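The script below skips comment lines with a simple grep. As a small sketch of the same idea, here is a hypothetical function named list_cores (my own name, not part of the script) that also tolerates blank lines and indented comments, which is an assumption beyond what the article's script handles:

```shell
#!/bin/bash
# list_cores FILE
# Print the switch addresses from FILE, one per line, ignoring blank lines
# and lines whose first non-blank character is '#'.
list_cores() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1"
}

# Example use (commented out so the sketch has no side effects):
# for core in $(list_cores /var/data/arpcache/core.dat); do
#     echo "would poll $core"
# done
```

Filtering blank lines matters because an empty entry would otherwise reach the ping and snmpwalk commands as an empty host name.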
Create a temporary directory underneath your arpcache directory.
mkdir -p /var/data/arpcache/tmp
Create a bash command file to do all the work, which I will annotate line by line.
vi arpdownload.sh
And enter the following lines:
#!/bin/bash
#
# arpdownload.sh written by lmj http://freenetworksecurity.blogspot.com
#
#
# SETUP SOME VARIABLES
#
TMP="/var/data/arpcache/tmp"
CORE="/var/data/arpcache/core.dat"
LOCKFILE="/var/lock/arpdownload.lock"
OUTPUT=$TMP/mysql.in
#
# Keeps the cron’d process from running over itself.
#
if [ -f $LOCKFILE ]; then
echo "Lockfile exists "$LOCKFILE
else
#
touch $LOCKFILE
#
# remove any stale data
#
[ -f $OUTPUT ] && rm -f $OUTPUT
#
# primary loop
#
for core in `cat $CORE | grep -v "^#"`
do
# ping the switch first to save time
ping -w 2 -q -c 1 $core > /dev/null 2>&1
if [ $? -eq 0 ]; then
# we will need this variable later.
export core0=`echo $core | awk -F'.' '{printf "%03d%03d%03d%03d\n",$1,$2,$3,$4}'`
# ping was successful, poll for data
#
snmpwalk -v 2c -c community_string $core at.atTable.atEntry.atPhysAddress | grep "Hex-STRING" > $TMP/$core.1
Note: substitute "community_string" with your real one.
The goal of this command is to produce some output we can massage to get exactly what we want. I piped it through grep to pull out only the lines I care about. My output looks like this at this point:
RFC1213-MIB::atPhysAddress.1.1.127.0.0.11 = Hex-STRING: 00 00 11 00 00 00
RFC1213-MIB::atPhysAddress.1.1.127.0.0.22 = Hex-STRING: 00 00 22 00 00 00
RFC1213-MIB::atPhysAddress.3.1.192.168.1.9 = Hex-STRING: 00 0B DB 87 84 E8
RFC1213-MIB::atPhysAddress.3.1.192.168.1.10 = Hex-STRING: 00 0D 92 B8 9C EA
RFC1213-MIB::atPhysAddress.3.1.192.168.1.20 = Hex-STRING: 00 B2 1F 6F 26 18
Your mileage may vary and you may need to add more grep pipes to isolate out only the lines you need to process. Still this output is not perfect, so we continue…
awk '{print $1"."$4"."$5"."$6"."$7"."$8"."$9}' $TMP/$core.1 > $TMP/$core.2
Now our output is slightly modified, but we aren’t even close to done…
RFC1213-MIB::atPhysAddress.1.1.127.0.0.11.00.00.11.00.00.00
RFC1213-MIB::atPhysAddress.1.1.127.0.0.22.00.00.22.00.00.00
RFC1213-MIB::atPhysAddress.3.1.192.168.1.9.00.0B.DB.87.84.E8
RFC1213-MIB::atPhysAddress.3.1.192.168.1.10.00.0D.92.B8.9C.EA
RFC1213-MIB::atPhysAddress.3.1.192.168.1.20.00.B2.1F.6F.26.18
tr "[A-F]" "[a-f]" < $TMP/$core.2 > $TMP/$core.3
Output is modified, but still not done…
Rfc1213-MIb::atPhysaddress.1.1.127.0.0.11.00.00.11.00.00.00
Rfc1213-MIb::atPhysaddress.1.1.127.0.0.22.00.00.22.00.00.00
Rfc1213-MIb::atPhysaddress.3.1.192.168.1.9.00.0b.db.87.84.e8
Rfc1213-MIb::atPhysaddress.3.1.192.168.1.10.00.0d.92.b8.9c.ea
Rfc1213-MIb::atPhysaddress.3.1.192.168.1.20.00.b2.1f.6f.26.18
tr "." " " < $TMP/$core.3 > $TMP/$core.4
Better, but still not done.
Rfc1213-MIb::atPhysaddress 1 1 127 0 0 11 00 00 11 00 00 00
Rfc1213-MIb::atPhysaddress 1 1 127 0 0 22 00 00 22 00 00 00
Rfc1213-MIb::atPhysaddress 3 1 192 168 1 9 00 0b db 87 84 e8
Rfc1213-MIb::atPhysaddress 3 1 192 168 1 10 00 0d 92 b8 9c ea
Rfc1213-MIb::atPhysaddress 3 1 192 168 1 20 00 b2 1f 6f 26 18
… and now all the variables we want are ripe for the picking… We might as well create our SQL input.
awk '{printf "INSERT DELAYED INTO arp SET id=NULL,sourceip=\"%s\",ipaddress=\"%03d%03d%03d%03d\",macaddress=\"%s%s%s%s%s%s\",arp=\"%03d%03d%03d%03d%s%s%s%s%s%s\",firstseen=NOW(),lastseen=NOW() ON DUPLICATE KEY UPDATE lastseen=NOW();\n", ENVIRON["core0"], $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $4, $5, $6, $7, $8,
$9, $10, $11, $12, $13}' $TMP/$core.4 >$TMP/$core.5
This will produce output like the following:
INSERT DELAYED INTO arp SET id=NULL,sourceip="010000000001",ipaddress="127000000011",macaddress="000011000000",arp="127000000011000011000000",firstseen=NOW(),lastseen=NOW() ON DUPLICATE KEY UPDATE lastseen=NOW();
Perfect for our purposes. Let’s copy this file to the output file.
cat $TMP/$core.5 >> $OUTPUT
# import the data into mysql
mysql arpcache < $OUTPUT
else
echo "Could not reach "$core
fi
done
#
# release the lock so the next cron run can proceed
#
rm -f $LOCKFILE
fi
# end of arpdownload.sh
Running this in a cron job every 5 to 10 minutes will give you a running snapshot of who is on your network. Type "crontab -e" as root and enter an entry like the following:
# download the arp cache from the routers every 5 minutes
0-59/5 * * * * /var/data/arpcache/arpdownload.sh
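Since cron starts a new copy every five minutes whether or not the previous one has finished, the lock file does real work here. One weakness of a plain touch is that a script killed mid-run leaves the lock behind, and every later run then refuses to start. A sketch of one common alternative, assuming a hypothetical function named acquire_lock (not part of the script above) that creates the lock atomically and removes it on exit:

```shell
#!/bin/bash
# Sketch of a sturdier lock; the function name is an assumption for illustration.
LOCKFILE="${LOCKFILE:-/var/lock/arpdownload.lock}"

acquire_lock() {
    # With noclobber set, the redirection fails if the file already exists,
    # so testing for the lock and creating it happen in a single step.
    if ( set -o noclobber; echo "$$" > "$LOCKFILE" ) 2>/dev/null; then
        # Remove the lock whenever the script exits, even after an error.
        trap 'rm -f "$LOCKFILE"' EXIT
        return 0
    fi
    return 1
}

# Example use at the top of a cron script (commented out here):
# acquire_lock || { echo "Lockfile exists $LOCKFILE"; exit 1; }
```

The lock file holds the process ID of the run that created it, which makes it easier to tell a live run from a stale lock when investigating by hand.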
What kind of data will this yield? One could be curious about a particular IP address. Who, over time, has used this IP address (the original forensic purpose)?
mysql arpcache -e "SELECT macaddress,firstseen,lastseen FROM arp WHERE ipaddress='192168161052'"
+--------------+---------------------+---------------------+
| macaddress   | firstseen           | lastseen            |
+--------------+---------------------+---------------------+
| 000dxxx596de | 2006-12-15 13:25:32 | 2006-12-15 17:30:38 |
| 000dxxxda7c4 | 2006-08-25 11:26:18 | 2006-12-09 21:40:27 |
| 0014xxxfd5f7 | 2007-01-22 12:45:36 | 2007-04-30 18:15:44 |
| 0003xxx41dfc | 2007-05-11 16:10:37 | 2007-05-11 20:05:33 |
+--------------+---------------------+---------------------+
Want to know where a particular host has been?
mysql arpcache -e "SELECT ipaddress,firstseen,lastseen FROM arp WHERE macaddress='000dxxx596de'"
+--------------+---------------------+---------------------+
| ipaddress    | firstseen           | lastseen            |
+--------------+---------------------+---------------------+
| 192168161052 | 2006-12-15 13:25:32 | 2006-12-15 17:30:38 |
| 010xxx019155 | 2006-12-10 18:30:12 | 2006-12-11 01:25:09 |
| 010xxx016107 | 2006-11-28 16:30:14 | 2006-11-28 22:45:10 |
| 010xxx027222 | 2006-09-29 13:20:10 | 2006-09-29 19:05:09 |
| 010xxx029224 | 2007-01-23 10:25:08 | 2007-01-23 15:45:10 |
| 010xxx031227 | 2007-01-25 10:25:10 | 2007-01-25 15:45:11 |
| 010xxx027184 | 2007-02-08 10:35:10 | 2007-02-08 14:30:12 |
| 010xxx023146 | 2007-04-24 10:45:09 | 2007-04-24 14:45:11 |
| 010xxx018169 | 2007-05-08 08:00:11 | 2007-05-08 16:15:11 |
| 010xxx030178 | 2007-06-04 10:25:10 | 2007-06-08 14:25:10 |
+--------------+---------------------+---------------------+
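Both queries take and return IP addresses in the zero-padded twelve-digit form used by the table, so a dotted address has to be converted before querying and converted back for reading. A small sketch of the two conversions, using hypothetical helper names pad_ip and unpad_ip that are not part of the original scripts:

```shell
#!/bin/bash
# pad_ip DOTTED_IP: 192.168.161.52 -> 192168161052
pad_ip() {
    echo "$1" | awk -F. '{printf "%03d%03d%03d%03d\n", $1, $2, $3, $4}'
}

# unpad_ip PADDED_IP: 010000000001 -> 10.0.0.1
unpad_ip() {
    echo "$1" | awk '{printf "%d.%d.%d.%d\n",
        substr($0, 1, 3), substr($0, 4, 3), substr($0, 7, 3), substr($0, 10, 3)}'
}

# Example use (commented out because it needs the arpcache database):
# mysql arpcache -e "SELECT macaddress,firstseen,lastseen FROM arp WHERE ipaddress='$(pad_ip 192.168.161.52)'"
```

The fixed-width form also has a side benefit: sorting the column as plain text orders the addresses numerically.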
How many hosts were potentially on the network at a given point? Remove the COUNT() if you want to know who they were.
mysql arpcache -e "SELECT count(macaddress) FROM arp WHERE lastseen > '2007-04-15' AND firstseen < '2007-04-15'"
+-------------------+
| count(macaddress) |
+-------------------+
| 23849 |
+-------------------+
Because of the nature of the INSERT statement explained above, this database should be self-optimizing. It inserts a record only the first time a particular IP address and MAC address pairing is seen. After that, only lastseen is updated.
On to an advanced topic… How can we use this data to protect ourselves against attacks that use ARP poisoning to capture traffic, such as the NetSniff Trojan?
With a slight modification of our original script we can include a simple check to see if any single MAC address is hoarding multiple IP addresses. Inside the loop above insert the following line, after the "tr "." " " < $TMP/$core.3 > $TMP/$core.4" line:
awk '{printf "%03d%03d%03d%03d %s%s%s%s%s%s\n",$4,$5,$6,$7,$8,$9,$10,$11,$12,$13}' $TMP/$core.4 > $TMP/$core.check
This stores relevant data into a separate file called $core.check which we will deal with at the end of the file.
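The same two-column format can also be produced straight from the raw snmpwalk output in a single awk pass, which may be handy when testing the check script on its own. A sketch, assuming the output format shown earlier and a hypothetical function name to_check_format:

```shell
#!/bin/bash
# to_check_format: read snmpwalk "Hex-STRING" lines on stdin and print
# "<zero-padded ip> <lowercase mac>" pairs, one per line.
to_check_format() {
    awk '/Hex-STRING/ {
        # The IP address is the last four dot-separated parts of field 1
        n = split($1, oid, /\./)
        # The six hex octets are fields 4 through 9
        mac = tolower($4 $5 $6 $7 $8 $9)
        printf "%03d%03d%03d%03d %s\n", oid[n-3], oid[n-2], oid[n-1], oid[n], mac
    }'
}

# Example use (commented out because it needs a saved snmpwalk capture):
# to_check_format < /var/data/arpcache/tmp/10.0.0.1.1
```

Taking the IP address from the end of the OID rather than from fixed field positions is a design choice of this sketch. It keeps working if a vendor's output has a different number of index components before the address.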
Now add this at the end of the file:
cat $TMP/*.check > $TMP/check.all
/var/data/arpcache/check.sh $TMP/check.all > $TMP/email.txt
if [ -s $TMP/email.txt ]; then
rm -f $TMP/email-ad.tmp
echo "SUBJECT: ARP Download Report" >> $TMP/email-ad.tmp
echo "Content-type: text/plain" >> $TMP/email-ad.tmp
echo >> $TMP/email-ad.tmp
echo "" >> $TMP/email-ad.tmp
echo "ARP Cache retrieval report:" >> $TMP/email-ad.tmp
echo "" >> $TMP/email-ad.tmp
echo "" >> $TMP/email-ad.tmp
cat $TMP/email.txt >> $TMP/email-ad.tmp
/usr/sbin/sendmail -r security@example.edu security@example.edu < $TMP/email-ad.tmp
fi
Notice we are calling an external script, /var/data/arpcache/check.sh, which is here:
#!/bin/bash
# Check for duplicate MACS
#
TMP=/var/data/arpcache/tmp
#
# MAC addresses that we know about that take up multiple IP addresses
# such as firewalls, vpn concentrators, modempools, etc, one per line
# these mac addresses would be in this format:
# 0000xxx7ac04
# 0000xxx7ac07
# 0000xxx7ac0a
# 0000xxx7acff
WHITELIST=/var/data/arpcache/check.whitelist
#
CHECKFILE=$1
#
# threshold of how many IPs a MAC can legally have before we alert
#
LIMIT=20
let LIMIT=$LIMIT-1
if [ -f $CHECKFILE ]; then
sort -k 2 -k 1 < $CHECKFILE > $CHECKFILE.1
awk '{print $2}' $CHECKFILE.1 | sort -u > $CHECKFILE.2
grep -v -f $WHITELIST $CHECKFILE.2 > $CHECKFILE.3
mv -f $CHECKFILE.3 $CHECKFILE.4
for mac in `cat $CHECKFILE.4`;
do
SL=`expr length $mac`
if [ $SL -eq 12 ]; then
grep -w $mac $CHECKFILE.1 > $CHECKFILE.check.1
CNT=`wc -l $CHECKFILE.check.1 | awk '{print $1}'`
if [ $CNT -gt $LIMIT ]; then
echo "MAC Address "$mac" appears "$CNT" times:"
cat $CHECKFILE.check.1 | awk '{printf " %d.%d.%d.%d\t%s:%s:%s:%s:%s:%s\n",substr($1,1,3),substr($1,4,3),substr($1,7,3),substr($1,10,3),substr($2,1,2),substr($2,3,2),substr($2,5,2),substr($2,7,2),substr($2,9,2),substr($2,11,2)}' | awk '{printf " %-16s %16s\n",$1,$2}'
fi
fi
done
else
echo "No file to check"
fi
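The core of check.sh is counting how many IP addresses each MAC address holds. As a compact sketch of just that counting step, here is a hypothetical function named find_hoarders. It is not a drop-in replacement: it skips the whitelist and the formatted report, and unlike check.sh it counts distinct IP addresses, so a pair reported by several core switches is counted once.

```shell
#!/bin/bash
# find_hoarders FILE LIMIT
# FILE holds "<padded ip> <mac>" lines, as in the .check files.
# Prints "<mac> <count>" for every MAC holding at least LIMIT distinct IPs.
find_hoarders() {
    # sort -u drops pairs that were reported by more than one core switch
    sort -u "$1" |
    awk -v limit="$2" '
        { count[$2]++ }
        END {
            for (mac in count)
                if (count[mac] >= limit)
                    print mac, count[mac]
        }
    ' | sort
}

# Example use (commented out because it needs a real check file):
# find_hoarders /var/data/arpcache/tmp/check.all 20
```

With LIMIT set to 20 this flags the same MAC addresses as the CNT test in check.sh whenever the input has no repeated pairs.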
A great use for this data: How about a search for any stolen data that might appear in our arpcache? I'll leave that for next time.

