26 August 2010

Cryptography (of PCI data) is Hard

I used to work in card payments and worked with the crypto, and even participated in ASC X9F6.  One of the things that I thought about for over 6 years was how to encrypt the card number (primary account number - PAN).  Sixteen decimal digits carry at most about 53 bits of entropy, and far fewer in practice.  There are only a few initial numbers - 4 for Visa, 5 for Mastercard, etc - plus the cards for a given financial institution are all going to start w/ the same Bank Identification Number (BIN).   The hard problem comes if you are using the PAN for the primary key in a database.  If not, you can usually find some other data to XOR it with.  But that data must not be predictable.
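A rough back-of-the-envelope sketch of how little entropy a PAN carries. The assumptions here are mine, not part of any standard calculation: a 6-digit BIN that the attacker already knows, and a final Luhn check digit that is fully determined by the other digits. The function name is just for illustration.

```python
import math

def entropy_bits(free_digits: int) -> float:
    """Upper bound, in bits, on the entropy of uniformly random decimal digits."""
    return free_digits * math.log2(10)

# All 16 digits treated as uniformly random (the best case for the defender).
full_pan_bits = entropy_bits(16)

# Assumption: the 6-digit BIN is known and the last digit is a Luhn check digit,
# leaving only 9 digits the attacker actually has to guess.
known_bin_bits = entropy_bits(16 - 6 - 1)

print(f"16 free digits: {full_pan_bits:.1f} bits")
print(f" 9 free digits: {known_bin_bits:.1f} bits")
```

Either way, the search space is small enough to enumerate, which is why a deterministic encryption of the bare PAN is so exposed to dictionary attacks.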

However, the ogre of PCI DSS compliance has driven everybody to smear a little soothing encryption on their pain.  The result is some crummy, non-X9F6 approved, encryption schemes.  The lid got ripped off that last week in Storefront Backtalk by Evan Schuman.  Everybody in retail and payment cards subscribes to and reads Storefront Backtalk.   So it really ruffled some feathers:

http://www.storefrontbacktalk.com/securityfraud/encryption-implementation-really-matters/

As an aside, we solved the PAN encryption as primary key problem by generating a random salt every time we generated a working storage encryption key.  We would encrypt them both w/ the KEK.  Then we'd XOR the salt with the PAN before encrypting or after decrypting it.  This effectively doubled the key strength.  The working storage encryption key was rotated at least annually.  The only hard part is that you can't take the system offline to rotate the keys, but we handled that by retrying w/ the old key for a record not found condition, assuming that the key rotation window was of short duration.
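Here is a minimal sketch of the salt-and-retry idea, not our production code. Loud assumptions: the "cipher" is a toy 8-round Feistel network built on HMAC-SHA256 from the Python standard library, standing in for whatever approved block cipher a real system would use; the KEK wrapping of the key and salt is not shown; and every name (StorageKey, encrypt_pan, lookup) is hypothetical.

```python
import hashlib
import hmac
import secrets

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Feistel round function: HMAC-SHA256 truncated to half a block (4 bytes).
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[:4]

def toy_encrypt(key: bytes, block: bytes) -> bytes:
    # Stand-in for a real block cipher. Illustration only, do not deploy.
    left, right = block[:4], block[4:]
    for i in range(8):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right

def toy_decrypt(key: bytes, block: bytes) -> bytes:
    left, right = block[:4], block[4:]
    for i in reversed(range(8)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right

class StorageKey:
    """Working storage key plus the random salt generated alongside it.
    In the scheme described above, both would be stored encrypted under the KEK."""
    def __init__(self) -> None:
        self.key = secrets.token_bytes(16)
        self.salt = secrets.token_bytes(8)

def encrypt_pan(sk: StorageKey, pan: str) -> bytes:
    block = bytes.fromhex(pan)                 # 16 digits packed into 8 bytes
    return toy_encrypt(sk.key, _xor(block, sk.salt))   # XOR salt, then encrypt

def decrypt_pan(sk: StorageKey, token: bytes) -> str:
    return _xor(toy_decrypt(sk.key, token), sk.salt).hex()  # decrypt, then XOR salt

def lookup(table: dict, pan: str, current: StorageKey, old: StorageKey = None):
    """Find a record keyed by encrypted PAN; on a miss, retry under the old key."""
    record = table.get(encrypt_pan(current, pan))
    if record is None and old is not None:
        record = table.get(encrypt_pan(old, pan))
    return record
```

During the rotation window, a record written under the old key is still found by the retry, so the system never has to go offline. Once every record has been re-encrypted, the old key can be dropped and the retry goes away.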

Meanwhile, the cryptographers have been developing format preserving encryption (FPE), led by Voltage.  If you haven't seen this, here is a link to a paper about Voltage's FFX: http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/proposedmodes/ffx/ffx-spec.pdf   This is off the proposed modes page: http://csrc.nist.gov/groups/ST/toolkit/BCM/modes_development.html

The PCI have been very nervous about this use of crypto, but since every Tom, Dick and Harriet in the point-of-sale business has been jumping on it, it is hard to stop.  Heartland Payment Systems has been beating the drum for this every time their CEO, Bob Carr, gets a chance to talk for the last two years.  X9F1, the cryptographer part of ASC X9F, has been glacially thinking about it, as has NIST CSRC, with no yea or nay yet.

Regardless of whether FPE is sound or not, encrypting the whole transaction without XORing it with unpredictable data is madness.  We'll have to see how this plays out.  After the RBS Worldpay breach a couple of years ago, where the crooks got malware on the payment system, sniffed the traffic to the hardware security module and built a dictionary attack against the PINs, it is clear that the bad guys have some decent cryptographers and cryptographic engineers in their midst.

23 August 2010

The Eternal Tao: Push vs. Pull

The Tao is the concept of opposites, articulated historically by Laozi (the Lao Tzu of my youth).   The two opposites are referred to as yin and yang.

I am a Taoist in that I can see large patterns driven by opposites that seem diametrically opposed, and seem to always manifest, one with the other.

Right now on the Internet, another major paradigm shift appears to be happening: from push to pull.  In this model, push is moving information to where the application is that wants to use it, whereas pull is the application going to get information when it needs it.

Pull is enabled by cheap, fast, global communications, and standard ways to represent metadata - the data about data. (Ouch!  It always makes my head hurt a little bit to say things like that, but it's true.)  I am reading Pull: The Power of the Semantic Web to Transform Your Business by David Siegel.  This is a terrific book, the best I've read in a while, and when I am done, I will be writing a review in this blog.  Pull and metadata are the whole topic of Siegel's book.

Yin and yang seem to lob reality back and forth between them like a cosmic tennis game.   It has been compared to a pendulum, but I think of it more as a spiral - the classic Hegelian dialectic: thesis, antithesis, and then synthesis.   The two opposites usually seem to be tied to some third concept, at right angles to both, and that is what produces the spiral.

There is an old saying about remote operations, attributed to Don Box.  If the cost of a local function call is 1, then the cost of a call that crosses local process boundaries is 1,000 and the cost to cross machine boundaries is 1,000,000.  A big part of this is because every communication transaction has two costs: channel seizure and then data transfer.  For short transactions, the channel seizure cost tends to be dominant.  That's why we open a file once and then read/write from it many times or open a TCP/IP connection, set up the SSL, and then use it for a while.  When you cross process boundaries, it takes a major context switch.  When you cross machine boundaries, it requires remoting, using stubs and ties at many levels.
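A toy cost model makes the channel seizure point concrete. The numbers and the function name are made up for illustration; the only claim is the shape of the arithmetic: paying the setup cost once versus paying it on every transaction.

```python
def total_cost(calls: int, setup: int, per_transfer: int, reuse_channel: bool) -> int:
    """Total cost of `calls` short transactions, in arbitrary units.

    setup        -- channel seizure cost (open the file, connect, SSL handshake)
    per_transfer -- cost of moving the data for one transaction
    """
    if reuse_channel:
        return setup + calls * per_transfer    # seize the channel once
    return calls * (setup + per_transfer)      # seize it on every call

# Hypothetical numbers: setup is 100x the cost of one short transfer.
print(total_cost(1000, setup=100, per_transfer=1, reuse_channel=False))
print(total_cost(1000, setup=100, per_transfer=1, reuse_channel=True))
```

With these made-up units, reopening the channel for each of 1,000 calls costs 101,000 while reusing it costs 1,100, so nearly all of the cost in the first case is channel seizure.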

Caching can go a long way to hide this cost.  That is why modern computing systems use lots of pools of pre-built, expensive objects, such as connections to files, databases, and remote machines, and keep pools of recently retrieved data from remote processes and machines.  The hard problem here is cache-coherency, which means keeping the local copy in the cache in sync with changes to real data.  Fortunately lots of work has been done on this, so we have a lot of tools in our toolbox.  A meatspace example of the caching problem might be as simple as finding out that a relation has a new child that you didn't know about, or as complex as figuring which is the final last will & testament of a deceased person.
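One of the simplest tools in that toolbox is a time-to-live cache: keep the local copy for a bounded time, then go back to the source. This is only a sketch of that one technique, with hypothetical names (TTLCache, fetch), and it bounds staleness rather than guaranteeing coherency.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple

class TTLCache:
    """Cache results of an expensive remote fetch for at most `ttl_seconds`."""

    def __init__(self, fetch: Callable[[Hashable], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch        # the expensive call (remote process or machine)
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so the expiry logic can be tested
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]                    # still fresh: no remote call
        value = self._fetch(key)               # missing or stale: go to the source
        self._entries[key] = (now, value)
        return value

    def invalidate(self, key: Hashable) -> None:
        """Drop a key when we learn the real data changed."""
        self._entries.pop(key, None)
```

The trade-off is the one in the meatspace example: between refreshes, the cache can be confidently wrong about the new child. A shorter TTL narrows that window at the price of paying the remote cost more often.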

Still, cheap, fast, global communications, and standard ways to represent metadata are all reducing the channel seizure cost to go get the data when you need it.  Thus new application features are becoming possible and the Internet is becoming a more lively, integrated environment.  I find it really exciting and fun!