
Re: Sun single-CPU DOS



:Sun says it is jabber, which is why I put it in quotes. Since they have not
:replicated in lab, they are jumping to conclusions. Yes, I agree,
:it is very specific and the backline engineer usage appears 'stretching things'
Most Sun adapters have an actual jabber counter that netstat -k will 
spew out for you.  You can eliminate ambiguity easily enough.  Here's
an example I Google'd for:

netstat -k eri0
eri0:
ipackets 525571 ierrors 365 opackets 8446 oerrors 0 collisions 85
ifspeed 10000000 rbytes 73324309 obytes 1118022 multircv 99205 multixmt 6
brdcstrcv 415863
brdcstxmt 10 norcvbuf 0 noxmtbuf 0 inits 4 rx_inits 8 tx_inits 1
nocarrier 1 nocanput 0 allocbfail 0 drop 321 pasue_rcv_cnt 0
pasue_on_cnt 0 pasue_off_cnt 0 pasue_time_cnt 0 txmac_urun 0
txmac_maxpkt_err 0 excessive_coll 0 late_coll 0 first_coll 35
defer_timer_exp 0 peak_attempt_cnt 0 jabber 0 no_tmds 0

(see, "jabber")

tx_hang 0 rx_corr 0 no_free_rx_desc 0 rx_overflow 0 rx_hang 0
rx_align_err 64 rx_crc_err 19 rx_length_err 0 rx_code_viol_err 0
bad_pkts 321 runt 40 toolong_pkts 279 rxtag_error 0 parity_error 0
pci_error_interrupt 0 unknown_fatal 0 pci_data_parity_err 0
pci_signal_target_abort 0 pci_rcvd_target_abort 0 pci_rcvd_master_abort 0
pci_signal_system_err 0 pci_det_parity_err 0 ipackets64 525571
opackets64 8446 rbytes64 73324309 obytes64 1118022 pmcap 4
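If you want to pull the interesting counters out of that wall of text, something like the following would do. This is only a rough sketch: it assumes the output is plain "name value" pairs as in the sample above, `parse_kstat` is a name I made up, and the list of counters worth looking at for a physical-layer problem is my own guess rather than anything Sun documents.

```python
import re

# A few lines trimmed from the eri0 sample above.
SAMPLE = """\
eri0:
ipackets 525571 ierrors 365 opackets 8446 oerrors 0 collisions 85
defer_timer_exp 0 peak_attempt_cnt 0 jabber 0 no_tmds 0
rx_align_err 64 rx_crc_err 19 rx_length_err 0 rx_code_viol_err 0
bad_pkts 321 runt 40 toolong_pkts 279 rxtag_error 0 parity_error 0
"""


def parse_kstat(text):
    """Turn 'name value name value ...' text into a dict of ints.

    A pair may be split across a line wrap; the "eri0:" header is skipped
    because it is not followed by whitespace and a number.
    """
    counters = {}
    for name, value in re.findall(r"([A-Za-z_]\w*)\s+(\d+)", text):
        counters[name] = int(value)
    return counters


# Assumption: these are the counters most likely to move on a real
# physical-layer fault. The selection is illustrative, not authoritative.
PHYSICAL = ("jabber", "rx_align_err", "rx_crc_err", "runt", "toolong_pkts")

stats = parse_kstat(SAMPLE)
for name in PHYSICAL:
    print(name, stats.get(name))
```

A jabber count that stays at 0 while the box is falling over would be a decent argument against the "jabber" label.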

:In this case it's tcp/ip.
:
:step 1) telnet to router
:step 2) ping some remote device on a fast link (like  2GB IP/Sonet)
:step 3) watch as returning tcp/ip telnet stream DOS's the sun.
:
:it is not the cisco ping that is DOS'ing the sun, it is the return stream
:of !!..!.!!!....!!!..!!!...  (ad infinitum)

Ahhh, so it's just the return traffic from the Cisco printing out all
that !!..!.!!! stuff (corresponding to whatever it is that the Cisco is
pinging) that causes all this?  Nifty!  I didn't think that the Cisco
could print that fast!  I'm fairly certain it should rate-limit/sample
that output (unless some automated thingy actually cares about that 
output coming from the Cisco).

:the nagle comes into play in the tcp-stream not coalescing all the
:single char tcp/ip packets each with a single ! or . in it.

Makes perfect sense now that I get what the traffic is.  As an aside,
the Nagle algorithm was designed with telnet explicitly in mind, per
RFC 896.  But, a lot of folks these days use telnet for stuff apart
from interactive use, and I could see someone wanting to disable it
for performance's sake.  For bare-bones stack implementations, Nagle
may not be there at all.
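For reference, on a host with a BSD-style sockets API, turning Nagle off is a single socket option, TCP_NODELAY. The sketch below shows that knob from Python. It says nothing about how the Cisco is configured, and `set_nagle` and `nagle_enabled` are just names I picked for illustration.

```python
import socket


def set_nagle(sock, enabled):
    """Enable or disable Nagle's algorithm on a TCP socket."""
    # TCP_NODELAY=1 turns Nagle OFF, so the flag is the inverse of 'enabled'.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 0 if enabled else 1)


def nagle_enabled(sock):
    """Return True if Nagle's algorithm is active on this socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) == 0


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
set_nagle(sock, False)  # every small write may now go out as its own segment
print("Nagle enabled:", nagle_enabled(sock))
sock.close()
```

With Nagle off, a sender that writes one character at a time can emit one tiny segment per character, which is the pattern described for the stream of ! and . characters.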

:right. totally agreed. it should not cause the machine to totally lock up.
:(I specified wrong earlier, btw. Break still works, just nothing else does)

That makes it sound even more like an interrupt issue rather than some
overall system lock.

:> In this particular case, if you're talking about ICMP, and there
:> really isn't a "jabber"/physical layer issue afoot, the idea is for
...
:getting that someone to not slap a 'jabber' label on things and
:dismiss it out of hand is where I am currently frustrated beyond
:belief.

Beyond netstat -k, you can probably use lockstat or other kernel
profiling tools as I mentioned in my earlier post to give them a
good idea of where the bug really is.  Interrupt issues aren't 
always going to be cut and dried.  There could be some particular 
flavor of IOS, network adapter, media type, CPU, OS, etc. that 
is more prone or less prone to the problem.  
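Before getting into kernel profiling, one cheap check is to grab the netstat -k counters before and after reproducing the problem and see which ones actually moved. A minimal sketch follows, assuming the counters have already been parsed into dicts. `counter_deltas` is a hypothetical helper, the "before" values are borrowed from the eri0 sample earlier in this message, and the "after" values are invented purely to show the mechanics.

```python
def counter_deltas(before, after):
    """Return only the counters whose value changed between two snapshots."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] != before[name]
    }


# "before" uses values from the eri0 sample; "after" is made up for
# illustration only and is NOT real data from the affected machine.
before = {"ipackets": 525571, "jabber": 0, "rx_crc_err": 19, "norcvbuf": 0}
after = {"ipackets": 905571, "jabber": 0, "rx_crc_err": 19, "norcvbuf": 412}

print(counter_deltas(before, after))
# prints {'ipackets': 380000, 'norcvbuf': 412}
```

If the error counters stay flat while the packet counts climb, that points away from the wire and toward how the host handles the load.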

:well, yes, this was all quite accidental in the first place.
:The solution is really quite easy, don't disable nagle on the
:cisco in the first place. However, I'm much more concerned about
:the implications of a normal user being able to DOS the machine and
:Sun not caring enough to do due diligence to address the issue.

Judging from the number of times we've exchanged emails (I should
have asked for a network diagram sooner to help visualize this :) ),
sometimes it's not so easy.  And "what is or isn't a DoS" can be a
grey area where reasonable people may differ.  I could readily see
someone saying "if you point a stupid amount of traffic at something
it dies, have you considered just not doing that?".  

-- 
 Mail: mjo@xxxxxxxxxxx  WWW: http://dojo.mi.org/~mjo/  Phone: +1 650 933 9487
 =--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--==--=
"Even in the future, nothing works!"                             -Dark Helmet