Why not generate an actual GUID?

Feb 24, 2009 at 10:10 PM
So I needed to generate a GUID to feed into the MSI builder, and I called up the one in the 'Framework' namespace.  Much to my surprise, it didn't generate anything resembling a GUID.  Sure enough, it's just using an RNG.  Or, with a different TaskAction value, a different RNG.

GUID generation is covered by a standard.  Calling these strings of pseudo-random bits GUIDs is misleading and dangerous.  The page linked below rambles a bit but toward the bottom describes exactly the situation we have here:
http://visualbasic.about.com/od/usingvbnet/a/rndmnums_2.htm

So, is there a .NET hook to the real GUID generator in Windows?  If so I haven't found it yet.  This is absurd.

Coordinator
Feb 25, 2009 at 7:59 AM
Hi Stephen, thanks for the interesting article. The Guid Task only exposes the Standard and Crypto .NET methods for acquiring a GUID. When you say 'it didn't generate anything resembling a guid', what did you get and what were you expecting?

Regards

Mike
Feb 25, 2009 at 2:11 PM
After further review ... I was expecting a "type 1" UUID, the kind that is actually globally unique.  Apparently Microsoft prefers taking chances (on behalf of users) and uses type 4, which are often distinct but only "very unlikely to be duplicated in a given context".

Typical: fast, easy, and wrong.

Of course there are problems with type 1 uuids generated on virtual machines too, but that's solvable if you care.

The Create method produces a proper type 4 uuid.  The CreateCrypto method, however, simply generates 128 random bits.  It is completely non-conformant.
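For anyone who wants to check this kind of claim themselves, here is a minimal sketch in Python (not the task's own .NET code) that tests whether a 128-bit value carries the RFC 4122 variant and version 4 bits. The helper name `is_rfc4122_v4` is made up for illustration, and `os.urandom(16)` only stands in for "128 random bits with nothing fixed", which is the behaviour described above.

```python
import os
import uuid


def is_rfc4122_v4(u: uuid.UUID) -> bool:
    """Return True if the variant and version bits match an RFC 4122 version 4 UUID."""
    return u.variant == uuid.RFC_4122 and u.version == 4


if __name__ == "__main__":
    conformant = uuid.uuid4()               # standard library sets variant and version bits
    raw = uuid.UUID(bytes=os.urandom(16))   # 128 random bits, no bits fixed
    print(conformant, is_rfc4122_v4(conformant))
    print(raw, is_rfc4122_v4(raw))
```

Six bits are fixed in a version 4 UUID (two for the variant, four for the version), so an unadjusted random value passes this check only about one time in 64.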

Feb 25, 2009 at 2:32 PM

From RFC 4122:


4.4.  Algorithms for Creating a UUID from Truly Random or
      Pseudo-Random Numbers

   The version 4 UUID is meant for generating UUIDs from truly-random or
   pseudo-random numbers.

   The algorithm is as follows:

   o  Set the two most significant bits (bits 6 and 7) of the
      clock_seq_hi_and_reserved to zero and one, respectively.

   o  Set the four most significant bits (bits 12 through 15) of the
      time_hi_and_version field to the 4-bit version number from
      Section 4.1.3.

   o  Set all the other bits to randomly (or pseudo-randomly) chosen
      values.

   See Section 4.5 for a discussion on random numbers.
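The three steps of Section 4.4 can be sketched in a few lines. This is an illustration in Python, not the code of any library discussed in this thread; the function name `make_v4_uuid` is hypothetical, and the caller is assumed to supply 16 bytes from whatever random source they trust.

```python
import os
import uuid


def make_v4_uuid(random_bytes: bytes) -> uuid.UUID:
    """Turn 16 random bytes into a version 4 UUID by fixing the six reserved bits."""
    if len(random_bytes) != 16:
        raise ValueError("need exactly 16 bytes")
    b = bytearray(random_bytes)
    # Octet 6 holds the top of time_hi_and_version: set its high nibble to 0100 (version 4).
    b[6] = (b[6] & 0x0F) | 0x40
    # Octet 8 is clock_seq_hi_and_reserved: set its two high bits to 1 and 0 (RFC 4122 variant).
    b[8] = (b[8] & 0x3F) | 0x80
    # Every other bit is left exactly as the random source produced it.
    return uuid.UUID(bytes=bytes(b))


if __name__ == "__main__":
    print(make_v4_uuid(os.urandom(16)))
```

Feeding it bytes from a cryptographic source such as `os.urandom` gives a value that is both random and conformant, so the two properties are not in conflict.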

4.5.  Node IDs that Do Not Identify the Host

   This section describes how to generate a version 1 UUID if an IEEE
   802 address is not available, or its use is not desired.

   One approach is to contact the IEEE and get a separate block of
   addresses.  At the time of writing, the application could be found at
   <http://standards.ieee.org/regauth/oui/pilot-ind.html>, and the cost
   was US$550.

   A better solution is to obtain a 47-bit cryptographic quality random
   number and use it as the low 47 bits of the node ID, with the least
   significant bit of the first octet of the node ID set to one.  This
   bit is the unicast/multicast bit, which will never be set in IEEE 802
   addresses obtained from network cards.  Hence, there can never be a
   conflict between UUIDs generated by machines with and without network
   cards.  (Recall that the IEEE 802 spec talks about transmission
   order, which is the opposite of the in-memory representation that is
   discussed in this document.)

   For compatibility with earlier specifications, note that this
   document uses the unicast/multicast bit, instead of the arguably more
   correct local/global bit.
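As a rough sketch of the random node ID idea, assuming Python and its standard `secrets` and `uuid` modules: draw 48 random bits and force the least significant bit of the first octet (the multicast bit) to one. The name `random_node_id` is hypothetical.

```python
import secrets
import uuid


def random_node_id() -> int:
    """Return a 48-bit node ID: 47 random bits plus the multicast bit set to one."""
    node = secrets.randbits(48)
    # The first octet is the most significant byte of the 48-bit value,
    # so its least significant bit is bit 40.
    return node | (1 << 40)


if __name__ == "__main__":
    node = random_node_id()
    # uuid.uuid1 accepts an explicit node, so no network card address is used.
    print(uuid.uuid1(node=node))
```

Because real IEEE 802 addresses from network cards never have that bit set, a UUID built this way should not collide with one built from a hardware address.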

   Advice on generating cryptographic-quality random numbers can be
   found in RFC1750 [5].

   In addition, items such as the computer's name and the name of the
   operating system, while not strictly speaking random, will help
   differentiate the results from those obtained by other systems.

   The exact algorithm to generate a node ID using these data is system
   specific, because both the data available and the functions to obtain
   them are often very system specific.  A generic approach, however, is
   to accumulate as many sources as possible into a buffer, use a
   message digest such as MD5 [4] or SHA-1 [8], take an arbitrary 6
   bytes from the hash value, and set the multicast bit as described
   above.
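The generic approach in that last paragraph might look like the following. It is only a sketch: the choice of sources (host name, OS name, some random bytes) and the helper name `node_id_from_sources` are assumptions made for illustration, and the first 6 bytes of the SHA-1 digest are used where the RFC allows any 6.

```python
import hashlib
import os
import platform


def node_id_from_sources(sources: list[bytes]) -> int:
    """Hash the accumulated sources and derive a 48-bit node ID with the multicast bit set."""
    buffer = b"\x00".join(sources)          # accumulate everything into one buffer
    digest = hashlib.sha1(buffer).digest()  # message digest, as the RFC suggests
    node = int.from_bytes(digest[:6], "big")  # take 6 bytes of the hash
    return node | (1 << 40)                 # set the multicast bit of the first octet


if __name__ == "__main__":
    sources = [
        platform.node().encode(),    # computer name (may be empty on some systems)
        platform.system().encode(),  # operating system name
        os.urandom(16),              # extra randomness
    ]
    print(f"{node_id_from_sources(sources):012x}")
```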