Thursday, July 8, 2010

XPCOM: Generating a SHA1 hash of a string

While reading some old code, I thought this would be interesting to share. I tend to comment a fair bit, particularly when I'm frustrated or think I'll forget something. What follows are my comments from December 2008, and some updated SHA1 code.
/*****
* Generate a SHA1 hash of some data.
* Sources:
* Mainly based on example from https://developer.mozilla.org/en/NsICryptoHash
* With a conversion from string to array of numbers from http://www.cozmixng.org/retro/projects/piro/diff/backforwardthumbnail/trunk/content/backforwardthumbnail/backforwardthumbnail.js?compare_with=1438&format=diff&rev=1440
*
* Discussion:
* The basis of this function is the Mozilla documentation for nsICryptoHash,
* which does include a demonstration of generating a hash for a string -
* but it won't work (at least in this case). The Mozilla docs do some
* Unicode handling, however this generates an incorrect hash for our
* purposes. Instead, I found if I saved the data to a file, and then
* opened that file and passed the stream to nsICryptoHash, I would get the
* correct result. So, ignore the Unicode handling. Saving then opening a
* file is crazy, so a different stream was needed. StringInputStream was
* considered, and is used by FastServicesJavaScript (1) and Flock (2),
* but XUL Planet docs mention that the incoming string shouldn't contain
* null. This condition can't be guaranteed with the data we're handling,
* so another option was needed. The update method for nsICryptoHash can
* take an array of bytes. How can we convert a string to an array of
* bytes? Apparently, easily. Just create an array of each character's
* ordinal value using charCodeAt. This technique was found at
* cozmixng (3), and is a cut-and-paste of 3 lines.
* 1 - 2008/DEC http://wiki.fastmail.fm/index.php?title=FastServicesJavascript
* 2 - 2008/DEC https://lists.flock.com/pipermail/svn-commits/2007-September/012334.html
* 3 - 2008/DEC http://www.cozmixng.org/retro/projects/piro/diff/backforwardthumbnail/trunk/content/backforwardthumbnail/backforwardthumbnail.js?compare_with=1438&format=diff&rev=1440
* (3) is licensed under MPL, GPL & LGPL
* http://www.cozmixng.org/retro/projects/piro/browse/backforwardthumbnail/trunk/content/backforwardthumbnail/license.txt?rev=1429
********/

var SHA1 = function (data) {
    var hash = Components.classes["@mozilla.org/security/hash;1"]
                         .createInstance(Components.interfaces.nsICryptoHash);
    hash.init(hash.SHA1);
    // cozmixng split/map call: one array element per character, holding its ordinal value
    var byteArray = data.split('').map(function(c) { return c.charCodeAt(0); });
    hash.update(byteArray, byteArray.length);
    // false = return the raw binary digest rather than a base64 string
    return hash.finish(false);
};
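
Since finish(false) returns the digest as a raw binary string (one character per byte), it usually needs converting before it can be logged or compared with a published checksum. Below is a minimal sketch of a hex conversion, similar in spirit to the helper shown in the Mozilla nsICryptoHash example. The name toHexString and the sample input are assumptions for illustration, not part of the original code.

```javascript
// Hypothetical helper: convert a binary string (one character per byte)
// into a lowercase hex string, two digits per byte.
function toHexString(binaryString) {
  var hex = '';
  for (var i = 0; i < binaryString.length; i++) {
    // Keep only the low 8 bits of each character code
    var byteValue = binaryString.charCodeAt(i) & 0xff;
    // Pad single-digit values so every byte yields exactly two characters
    hex += (byteValue < 16 ? '0' : '') + byteValue.toString(16);
  }
  return hex;
}

// Made-up sample input containing a null byte and a high byte
var sampleHex = toHexString('\x00\x0fA\xff');
```

With the SHA1 function above, toHexString(SHA1(data)) would then give the 40-character hex form of the 20-byte digest.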

It's interesting re-reading this. I have found XPCOM components to be weird beasts, particularly when used in JavaScript. Looking back, this is partly due to my own lack of understanding. I find that trying to understand anything in Mozilla brings on a headache; confirming a bug, for example. It seems to me that once someone comes to understand anything in Mozilla, it is best if they blurt out that knowledge lest it be lost.

I really miss XUL Planet's documentation on sockets, that good old Pushing and Pulling. I feel lucky to have been able to catch that, and read their XPCOM reference before XUL Planet reached a decision to drop the content.

A few notes on the SHA1 function, in case they help others. I tend to read data from sockets, and use nsIBinaryInputStream to do it. I don't use nsIScriptableInputStream; in fact I will probably never use it again, simply because it doesn't handle null values. In the past, I've used the readBytes method (of nsIBinaryInputStream), which provides the result as a string in JavaScript. This may have been a good idea, maybe a bad one. Regardless, it is the reason for the call to split() on a string above.

The string is really just a collection of bytes, but now that it's in JavaScript everything has become a little odd. So we change the string into an array of numbers (split to an array, map to numbers), and then let XPCOM handle the conversion so that nsICryptoHash receives an array of bytes.
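
The split-and-map step can be tried on its own, outside XPCOM, to see that embedded nulls survive the conversion. This is only a sketch: the helper name binaryStringToBytes and the sample string are made up, and the & 0xff mask is an extra safeguard that is not in the original function (it assumes every character is meant to represent a single byte).

```javascript
// Hypothetical helper: turn a binary string (one character per byte)
// into an array of numbers suitable for passing as a byte array.
function binaryStringToBytes(data) {
  return data.split('').map(function (c) {
    // Assumption: mask to 8 bits so no element can exceed 255.
    // The original code uses charCodeAt(0) without the mask.
    return c.charCodeAt(0) & 0xff;
  });
}

// Made-up sample: 'A', a null byte, 'B', and the byte 0xFF
var sample = 'A\u0000B\u00ff';
var bytes = binaryStringToBytes(sample);
// bytes has one entry per character; the null byte becomes 0 rather than ending the data
```

The array length always matches the string length, which is why the original code can pass byteArray.length straight to update().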

It seems so bloody obvious now, but reading the comments above replays part of the journey it took to get there.
