Tuesday 29 December 2009

network buffers

The next interesting part is looking at the interconnect fabric, which in this case is 2x 1-Gigabit Ethernet NICs + a switch, using UDP at the protocol level. The test is conducted with a stock 2.6.30 kernel + socket library and the standard Rx/Tx socket buffer sizes. What we're testing here is the time it takes to send each message in its own UDP frame, i.e. one sendto() per ITCH message. The results are surprising, in a bad way.
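For reference, the send side of the measurement loop looks roughly like this - a minimal sketch, assuming one pre-encoded ITCH message per sendto() and clock_gettime() for timestamping. The destination address, port and message size are illustrative, not from the actual test harness:

#include <stdint.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

/* Hypothetical sketch of the send-side timing loop: one UDP frame
 * per ITCH message, timestamped around each sendto(). */
int main(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    struct sockaddr_in dst;
    memset(&dst, 0, sizeof(dst));
    dst.sin_family      = AF_INET;
    dst.sin_port        = htons(12345);              /* assumed port   */
    dst.sin_addr.s_addr = inet_addr("192.168.0.2");  /* assumed target */

    char msg[64];                 /* stand-in for one encoded ITCH msg */
    memset(msg, 0, sizeof(msg));

    for (int i = 0; i < 1000000; i++) {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        sendto(fd, msg, sizeof(msg), 0,
               (struct sockaddr *)&dst, sizeof(dst));
        clock_gettime(CLOCK_MONOTONIC, &t1);

        long ns = (t1.tv_sec - t0.tv_sec) * 1000000000L
                + (t1.tv_nsec - t0.tv_nsec);
        /* ... bin `ns` into the histogram ... */
        (void)ns;
    }
    close(fd);
    return 0;
}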



Here we see that send is well behaved, with an average around the 1200-1300ns mark and a fairly small standard deviation. That translates to roughly 3300 cycles @ 2.6GHz (the test machine's frequency), which is a lot... but considering how many library layers and a kernel call are in there, not that unreasonable.

OTOH recv is pretty bad. If send runs at full rate, a huge clusterfuck of packets gets dropped, so we throttle send to 1 message every 10,000ns (10us), which lets recv see every packet.
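The throttle itself is just a busy-wait on the clock. A sketch of what that pacing could look like - the 10us interval is the only real parameter, everything else is illustrative:

#include <time.h>
#include <stdint.h>

/* Sketch of the 10us send throttle: busy-wait until the next
 * 10,000ns slot before each sendto(). A sleep-based wait would
 * be too coarse at this resolution, hence the spin. */
static inline uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + ts.tv_nsec;
}

void paced_send_loop(void)
{
    const uint64_t interval_ns = 10000;    /* 1 msg per 10us */
    uint64_t next = now_ns();

    for (;;) {
        while (now_ns() < next)
            ;                              /* spin until the slot opens */
        /* sendto(...) one ITCH message here */
        next += interval_ns;
    }
}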



Thus on the send side (above) we get the expected peak around the 10us mark, with a fairly small stdev, but what's up with that second spike? In contrast, the recv histogram is a mess.



There's some clear reciprocal send behavior here - the two spikes around 10us - but the spread is all over the map. Obviously something we really don't want when latency is so critical.

Throttling to 1 msg / 10us isn't a luxury we can afford, so what does it look like when send runs flat out?



Answer: quite interesting. The huge spike is centered around 500ns, i.e. 1335 cycles @ 2.67GHz (the test machine's frequency), which is promising; however the large flat chunk from 5000-10000ns would appear to be the problem. As for why it's taking that long... wtf? Looking at the network stack stats, the send machine drops 0 UDP packets while the recv machine drops a ton - on the order of 40% of the message volume - so it's pretty certain this is an Rx problem.
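The drop counts here come straight from the kernel's UDP counters - the same numbers netstat -su reports. On this kernel they live in /proc/net/snmp, where Rx-side drops show up as InErrors/RcvbufErrors. A hypothetical sketch of pulling them out:

#include <stdio.h>
#include <string.h>

/* Sketch: read the kernel UDP counters from /proc/net/snmp.
 * The file has a header line ("Udp: InDatagrams NoPorts InErrors
 * OutDatagrams RcvbufErrors SndbufErrors") followed by a value
 * line with the matching counts. */
int main(void)
{
    FILE *f = fopen("/proc/net/snmp", "r");
    if (!f)
        return 1;

    char line[512];
    int seen_header = 0;
    while (fgets(line, sizeof(line), f)) {
        if (strncmp(line, "Udp:", 4) != 0)
            continue;
        if (!seen_header) {          /* first Udp: line is the header */
            seen_header = 1;
            continue;
        }
        unsigned long in, noports, inerr, out, rcvbuferr, sndbuferr;
        if (sscanf(line, "Udp: %lu %lu %lu %lu %lu %lu",
                   &in, &noports, &inerr, &out,
                   &rcvbuferr, &sndbuferr) == 6)
            printf("InDatagrams=%lu InErrors=%lu RcvbufErrors=%lu\n",
                   in, inerr, rcvbuferr);
        break;
    }
    fclose(f);
    return 0;
}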




Changing the histogram to graph messages dropped against time (above) doesn't help much - basically the same graph, a little flatter in parts, but not very informative. Digging a bit deeper and plotting, for each time bin, the breakdown by number of packets dropped, the above graph makes more sense.



Green = sequential
Red = 1 missing packet
Blue = 2 missing packets
Orange = 3 missing packets
White = 4 or more missing packets

So it's easy to see there's little correlation between recv time and the number of packets dropped, which... is not what I expected - i.e. that the longer recv takes, the higher the number of dropped packets. The one unknown factor here is UDP packet re-ordering, as by definition ordering isn't guaranteed. However, as the network topology is trivial - nic -> switch -> nic - I find it hard to believe this would be a major factor. The next question is why this is occurring and what can be done about it.
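One obvious knob to poke at first - no claim this is the actual fix, that's the next question - is the Rx socket buffer, since the test used the stock default and an overflowing recv buffer under burst load is exactly how RcvbufErrors accumulate. A hedged sketch of bumping it, keeping in mind the kernel silently caps the request at net.core.rmem_max:

#include <stdio.h>
#include <sys/socket.h>

/* Sketch: ask for a bigger Rx socket buffer. The kernel clamps the
 * request to net.core.rmem_max (and doubles it internally for
 * bookkeeping), so read the value back to see what was actually
 * granted. Raising rmem_max via sysctl may be needed first. */
void grow_rcvbuf(int fd)
{
    int want = 8 * 1024 * 1024;   /* 8MB - illustrative, not tuned */
    setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &want, sizeof(want));

    int got = 0;
    socklen_t len = sizeof(got);
    getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &got, &len);
    printf("SO_RCVBUF: asked %d, got %d\n", want, got);
}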
