<<< Date Index >>>     <<< Thread Index >>>

Re: MD5 To Be Considered Harmful Someday



Gandalf The White wrote:
> Unfortunately when "The Press" publicized the MD5 hash discovery by Joux and
> Wang it almost sounded like "The Press" was surprised to find collisions in
> the MD5 domain (intuitive to me, a limited number of outputs and a infinite
> number of inputs = Collisions).  I assume that a "good" hash would have a
> even distribution of collisions across the domain and that the larger number
> of bits for the output the better the hash (assuming no cryptographic
> algorithm errors).

Somehow there may be a lesson in this somewhere (not entirely sure for
whom (?)), or maybe not.  Anyway, I'm copying this to my offline
wiki/askSam thingie for future cogitation.

My point:  I'm sure "The Press" one way or another was told or got the
impression that MD5 hashes were the "answer to a maiden's prayers" with
respect to file security (against corruption).  Not sure exactly how
they got that impression, probably neither you nor I (nor any of "us")
told them that -- if asked, we might have said something with
reservations -- like, MD5 can assure no file system corruption with a
probability of failure of (for example) 1 chance in some very big number
of failing.  

Gradually that second clause gets forgotten, sometimes intentionally
(but innocently, I think: "don't worry about it, it'll never happen"). 
The further the news of MD5 travels, the more likely only the first part
("MD5 can assure no file system corruption") of the message gets
through, not the qualifier ("with a probability of failure ...").  Even
if the initial clause is crafted better ("MD5 can almost assure no file
system corruption"), the "almost" disappears as the message is
propagated.  (Not everyone would use the same words for this example,
but choose almost any other set of words and you get the same result.)

Not sure what can be done about it, but I guess awareness of the problem
is one step toward a solution, which is why I'm noting this to myself
(and the list ;-).  

regards,
Randy Kramer

Asides: 

1. Is part of the solution to always stress the "almost"?
(Although, without thinking about it.

2. We (mostly) are insiders.  What is intuitive to us is (usually?) not
intuitive to outsiders, depending on the subtlety of the issue.

3. At one point I read the thesis of the guy that wrote rsync (is name
isn't on the tip of my tongue at the moment, that's embarrassing) -- I
was fairly disappointed (and surprised) to find that the reliability of
rsync relied on a similar "probalistic" approach (can't find the right
words).  As I recall, this was not mentioned on the first page of his
thesis (I could be wrong), the README for rsync, nor as a user message
when the program is invoked.  IIRC, when the subject was brought up in
his thesis, it was effectively dismissed as "it'll never happen".  (The
words invoked, if not actually used, phrases like "the heat death of the
universe".)