darkness

Saturday, 22 October 2005

I hate traffic monitoring

darkness @ 09:46:29

One of my clients has a Linux router. This router does forwarding/NAT/packet filtering for their local network. It also handles several IPsec connections:

  • Road warriors doing L2TP-over-IPsec.
  • GRE-over-IPsec to three of their other sites.
  • Straight IPsec to their parent company.

To clarify that third one: packets come in on the LAN interface of the Linux box, get wrapped up for IPsec and sent across the Internet to their parent company. I think this may be rightly called a “secure gateway” setup.

People have been having slowness over the VPN and I want to know why. Particularly, I’d like to graph the throughput of:

  • Their Internet T1
  • The tunnels to their parent company

(I say “tunnels” because technically it’s a slew of different IPsec SAs to their home office.) With these graphs I could say, “you’re simply pushing too much data over the VPN” or else “things other than the VPN are using your bandwidth.” Then I could put in some QoS stuff and see the difference.

The catch is that 26sec (the native IPsec stack in Linux 2.6), a.k.a. NET_KEY (or NETKEY), makes all your inbound IPsec traffic appear twice: once as the encrypted packet, once as the decrypted packet. For example, you might see the ESP packet come in, then the unencrypted contents of that ESP packet. This is pretty troublesome (for packet filtering as well, but I think I’ve covered that elsewhere).

Looking at the output of setkey -D, I see there might be some hope for gathering statistics. Of course, I don’t really know what I’m looking at, and the documentation doesn’t really help, but there look to be counters of some sort on each SA. I wonder if I can separate RX vs. TX with that.
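A rough Python sketch of how those counters might be pulled out. Everything here is an assumption rather than something I've verified on the router: the SAMPLE text is hand-typed to resemble what ipsec-tools' setkey -D prints (a "src dst" header line per SA, then indented lines including "current: N(bytes)" and "allocated: N"), the addresses and numbers are made up, and parse_sa_counters and rx_tx_bytes are hypothetical helper names. Since each SA is one-directional, comparing its destination against the router's own address is one plausible way to split RX from TX.

```python
import re

# Illustrative sample only: typed by hand to resemble `setkey -D` output.
# Addresses, SPIs and counters are invented; check against real output.
SAMPLE = """\
203.0.113.10 198.51.100.1
    esp mode=tunnel spi=305419896(0x12345678) reqid=16385(0x00004001)
    seq=0x00000000 replay=32 flags=0x00000000 state=mature
    current: 1048576(bytes)    hard: 0(bytes)    soft: 0(bytes)
    allocated: 900    hard: 0    soft: 0
198.51.100.1 203.0.113.10
    esp mode=tunnel spi=2271560481(0x87654321) reqid=16385(0x00004001)
    seq=0x00000000 replay=32 flags=0x00000000 state=mature
    current: 524288(bytes)    hard: 0(bytes)    soft: 0(bytes)
    allocated: 700    hard: 0    soft: 0
"""

HEADER = re.compile(r"^(\S+)\s+(\S+)\s*$")             # "src dst" line, not indented
BYTES = re.compile(r"^\s+current:\s+(\d+)\(bytes\)")   # byte counter line
PACKETS = re.compile(r"^\s+allocated:\s+(\d+)")        # packet counter line


def parse_sa_counters(text):
    """Return one dict per SA with src, dst, bytes and packets."""
    sas = []
    current = None
    for line in text.splitlines():
        if line and not line[0].isspace():
            # An unindented line starts a new SA (or is something we skip).
            match = HEADER.match(line)
            current = None
            if match:
                current = {"src": match.group(1), "dst": match.group(2),
                           "bytes": 0, "packets": 0}
                sas.append(current)
            continue
        if current is None:
            continue
        match = BYTES.match(line)
        if match:
            current["bytes"] = int(match.group(1))
            continue
        match = PACKETS.match(line)
        if match:
            current["packets"] = int(match.group(1))
    return sas


def rx_tx_bytes(sas, local_addr):
    """Sum byte counters for SAs ending at (RX) and starting from (TX) local_addr."""
    rx = sum(sa["bytes"] for sa in sas if sa["dst"] == local_addr)
    tx = sum(sa["bytes"] for sa in sas if sa["src"] == local_addr)
    return rx, tx


if __name__ == "__main__":
    print(rx_tx_bytes(parse_sa_counters(SAMPLE), "198.51.100.1"))
```

Polling this every N seconds and taking the difference between readings would give a rate, with the caveat that the counters presumably start over whenever an SA is rekeyed.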

In any case, there’s also not a lot of help from graphing utilities. I think I’ll have to learn more about RRDtool than I’d like. I’ve got some bookmarks on monitoring that might be relevant. Cacti might help me out. NetFlow/sFlow might help me out. I’ve got some links there to some NetFlow (I think) probes for Linux.

Maybe I should just write something that captures packets, takes a series of filters, and outputs separate sets of statistics for those filters every N seconds. For example, you could give it BPF-like filters such as “interface eth1 and ip proto esp and src my.parent.com” and associate each filter with some sort of data output. That output might be to a flat file, a pipe, a socket, or an RDBMS. (I’m not saying all of those, just whichever I need to interface with some kind of graphing software. Or RRDtool.) I think I could possibly do this with one of the NetFlow probes, but it seems like I’d have to run one instance for each filter. That seems like a bad idea, what with many NetFlow probe processes each getting woken up to process every new packet. One process could do it a lot better.
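To make the one-process idea a bit more concrete, here is a minimal Python sketch of just the accounting half. It assumes some capture layer (not shown) hands over each packet as a dict with "time", "iface", "proto", "src" and "length" keys; that record layout, the FilterStats class, and the 192.0.2.1 stand-in for my.parent.com are all hypothetical. Filters are plain Python predicates here, not real BPF.

```python
from collections import defaultdict


class FilterStats:
    """Accumulate per-filter packet and byte counts in fixed-length time buckets."""

    def __init__(self, filters, interval):
        self.filters = filters    # name -> predicate taking a packet dict
        self.interval = interval  # bucket length in seconds (the "N")
        # bucket start time -> filter name -> [packets, bytes]
        self.buckets = defaultdict(lambda: defaultdict(lambda: [0, 0]))

    def add(self, packet):
        """Count one packet against every filter it matches."""
        bucket = int(packet["time"] // self.interval) * self.interval
        for name, matches in self.filters.items():
            if matches(packet):
                counts = self.buckets[bucket][name]
                counts[0] += 1
                counts[1] += packet["length"]

    def report(self):
        """Yield (bucket_start, filter, packets, bytes, bits_per_second) rows."""
        for bucket in sorted(self.buckets):
            for name in self.filters:
                packets, nbytes = self.buckets[bucket].get(name, (0, 0))
                yield bucket, name, packets, nbytes, nbytes * 8 / self.interval


# Hypothetical filters; 192.0.2.1 stands in for my.parent.com.
FILTERS = {
    "parent_esp": lambda p: (p["iface"] == "eth1" and p["proto"] == "esp"
                             and p["src"] == "192.0.2.1"),
    "all_eth1": lambda p: p["iface"] == "eth1",
}

if __name__ == "__main__":
    stats = FilterStats(FILTERS, interval=10)
    for pkt in [
        {"time": 0.5, "iface": "eth1", "proto": "esp", "src": "192.0.2.1", "length": 1000},
        {"time": 3.0, "iface": "eth1", "proto": "tcp", "src": "198.51.100.7", "length": 500},
        {"time": 12.0, "iface": "eth1", "proto": "esp", "src": "192.0.2.1", "length": 1500},
    ]:
        stats.add(pkt)
    for row in stats.report():
        print(*row)
```

Every packet is examined once and checked against all filters in the same process, which is the point of not running one probe per filter. Each row from report() could then go to a flat file, a pipe, or whatever the graphing side wants.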

Maybe I’m wrong about NetFlow, though. Maybe I just have it send all its stats to a NetFlow collector, then it’s the collector’s job to pick out all the different reports I want to see.

Of course, I’ll still have the problem of separating out “is this packet I’m seeing one that was decrypted from an IPsec packet, or a regular packet?” There are a couple of different heuristics I could use. One is to look at the source/destination IP addresses for internal addresses. For filtering that’s a horrible idea, but for monitoring it might be doable (ignoring the idea of someone deciding to poison my monitoring reports with spoofed packets from the Internet). Another is marking: right now I set the nfmark on ESP packets coming in, and that nfmark gets copied onto the decrypted packet. BPF/libpcap can’t see this mark, I think, and I wouldn’t expect NetFlow to report it. However, I think I did find a NetFlow probe that actually collects packets sent to it from iptables’ ULOG, and I see ULOG supports up to 32 groups of packets. So I could match nfmark, ULOG to one group, etc. I wonder how CPU usage goes with that.
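The address heuristic is simple enough to sketch in Python with the standard ipaddress module. The subnets, the WAN_IFACE name, and the classify function are all hypothetical placeholders, not the client's real values, and as noted above this is only good enough for monitoring since it trusts addresses that could be spoofed.

```python
import ipaddress

# Hypothetical values: substitute the real interface and subnets.
WAN_IFACE = "eth1"
REMOTE_VPN_NETS = [ipaddress.ip_network("10.20.0.0/16")]  # parent company side
LOCAL_NETS = [ipaddress.ip_network("192.168.1.0/24")]     # this site's LAN


def classify(iface, proto, src, dst):
    """Guess whether a packet is ESP, decrypted-from-ESP, or ordinary traffic.

    Returns "esp", "decrypted" or "plain". A packet arriving on the WAN
    interface with an internal source and destination should not exist on
    the open Internet, so it is assumed to have come out of a tunnel.
    """
    if proto == "esp":
        return "esp"
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    from_remote = any(src_ip in net for net in REMOTE_VPN_NETS)
    to_local = any(dst_ip in net for net in LOCAL_NETS)
    if iface == WAN_IFACE and from_remote and to_local:
        return "decrypted"
    return "plain"


if __name__ == "__main__":
    print(classify("eth1", "esp", "203.0.113.10", "198.51.100.1"))
    print(classify("eth1", "tcp", "10.20.3.4", "192.168.1.50"))
    print(classify("eth1", "tcp", "203.0.113.99", "198.51.100.1"))
```

Counting only "esp" toward T1 usage and only "decrypted" toward the tunnel payload would avoid adding the same inbound traffic twice.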

Of course, in general I’m not looking forward to figuring out how to get something-or-another to graph this data, even if I do come up with data.

A couple more notes about iptables: iptables has all sorts of wicked cool things, both in the standard version and available as patches. Unfortunately, RHEL/FC kernels don’t seem to include some of them. For example, ROUTE is not included with RHEL4 or FC4 as near as I can tell. connmark/CONNMARK aren’t included with RHEL4. Give us more flexibility, guys! I’ll add that the IPsec-related matches really need to get into iptables and then into my distributions. Also, what’s this “raw” table I just found? The only thing that seems to use it is the NOTRACK target? Huh.

I had this crazy notion of rewriting the input interface on IPsec packets to something like dummy0, to make filtering/monitoring easier. I was going to do that with -j ROUTE --iif dummy0, but it says it doesn’t modify the packet. Even if it did, I think (though I’m not sure) the raw socket that programs use to capture traffic probably gets a copy of the packet before it reaches netfilter.
