Certificates and PKI
====================

administrivia:
  Victor will hold office hours on Monday, if you have quiz review questions
  quiz on Wednesday during lecture, in rooms 2-105 and 2-139
    we will send email with room assignment?
    even if you don't receive email, still come to the quiz!

recall from last time: Needham-Schroeder public-key-based protocol
  if A and B know each other's public keys, can agree on a fresh K_AB key
  assumption: trusted server S
  in steps 1 and 2:
    A -> S: A wants B's public key
    S -> A: { PK_B, B }_{SK_S}
  the response from S is called a certificate
    certificate usually binds a name to a public key, signed by some secret key
    entity that owns the secret key is called a "certificate authority" (CA)
  important property: doesn't matter where you get the certificate from
    it's OK to query S, but also OK to accept the cert from anyone else,
      including the person named in the certificate (B)
    assuming the signature scheme is secure, a cert that verifies under PK_S
      must have been signed by S
    why important?  allows S to scale: sign certs, don't respond to queries
  what do we lose by accepting certificates from arbitrary parties?
    freshness: how do we know this isn't B's stolen key from last year?
    usual solution: cert includes a validity period

how do we build a certificate infrastructure (typically called PKI)?
  single server works OK for MIT, but what happens at bigger scale?
  trust: no single server / administrator / company / govt is trusted by everyone
  performance: one server might not handle the load (even cert signing)
  naming: how do we identify all principals in the world
    (people, servers, cell phones, sensors, ...)?
  identity: how would the CA authenticate these principals?
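
sketch: the certificate idea from the recap above (S signs a name-to-key
binding plus a validity period, and anyone holding PK_S can check it no
matter who handed them the cert) might look roughly like the Python below.
everything here is an assumption for illustration: the toy RSA key built
from two small Mersenne primes (not secure), the dict-based cert format,
and the names issue_cert / verify_cert. it is not how X.509 encodes or
signs certificates.

```python
import hashlib

# Toy RSA key pair for the CA "S" (assumption: illustration only, insecure).
P, Q = 2**61 - 1, 2**89 - 1          # two Mersenne primes, far too small for real use
N = P * Q                            # public modulus, part of PK_S
E = 65537                            # public exponent, part of PK_S
D = pow(E, -1, (P - 1) * (Q - 1))    # secret exponent SK_S (needs Python 3.8+)

def digest(name, subject_key, not_before, not_after):
    """Hash the fields the certificate binds together, reduced mod N."""
    msg = f"{name}|{subject_key}|{not_before}|{not_after}".encode()
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % N

def issue_cert(name, subject_key, not_before, not_after):
    """Run by S: sign the binding with SK_S (hypothetical cert format)."""
    sig = pow(digest(name, subject_key, not_before, not_after), D, N)
    return {"name": name, "key": subject_key,
            "not_before": not_before, "not_after": not_after, "sig": sig}

def verify_cert(cert, now):
    """Run by anyone who knows PK_S = (N, E); the cert's source is irrelevant."""
    h = digest(cert["name"], cert["key"], cert["not_before"], cert["not_after"])
    in_validity_period = cert["not_before"] <= now <= cert["not_after"]
    return in_validity_period and pow(cert["sig"], E, N) == h

# B can hand out its own cert; A only needs PK_S and the current time.
cert_b = issue_cert("B", "PK_B-bytes", not_before=100, not_after=200)
print(verify_cert(cert_b, now=150))
```

the validity-period check is what stands in for freshness: a cert that
verifies but is outside [not_before, not_after] is rejected, and changing
any signed field (e.g. swapping in a different key) makes the signature
check fail.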

X.509: global public key infrastructure
  hierarchy of certificate authorities
             top-level "root" CA
               /  |  \  \  \
                   US CA
                 /  |  \  \  \
                      MIT
                   /  |  \  \
                       EECS
                     /  |  \  \
                  NickolaiZeldovich
  links are certificates attesting to the public key of next level's CA
  name is the path from the root to the principal in question
    called the distinguished name (DN)
    as opposed to a common name (CN), which is just NickolaiZeldovich
    C=US/O=MIT/OU=EECS/CN=NickolaiZeldovich
  certificates contain lots of other information [openssl x509 -text < cert]:
    version#, serial#, signature alg, issuing CA, validity, subject name,
    subject's public key, subject type/restrictions (encrypt, sign, delegate,
    what paths can be delegated, etc)

X.509 is widely used by SSL/TLS (https in a browser)
  only a small subset of X.509 features really used
  hierarchical names not used
    browsers just look at the CN= component, which holds the server's host name
  many CAs trusted to sign any CN value in the browser
    42 trusted CA organizations in Firefox, each with several keys
    key point: what matters is the weakest link
    real trust chain is even longer: cert used to download Firefox, etc
  how does a CA authenticate a CN= name when issuing a certificate?
    varies by CA, and varies over time
    a long time ago, had to fax or mail an official letter from organization
      not a high bar for someone who's slightly motivated
      even then, Verisign issued certificates for "Microsoft" to someone else
    now, just need to receive a secret code by email at webmaster@
    again, weakest link is all that matters to attacker
  compelled certificate creation
    law enforcement agencies apparently force CAs to sign certs
    traffic interception devices mount a man-in-the-middle attack with cert
    CAs are all over the place: e.g. China can spy on anyone's SSL traffic?

do we need a certificate infrastructure?
  big reason for PKI: translate principal names into public keys
  alternative plan: use public keys as principal names
    names in X.509 are so long, they're hard to remember/type anyway?
    whatever mechanism you use to obtain someone's name/email correctly,
      should also be used to obtain their public key, so don't need PKI?
  inverse problem is MUCH easier to solve: public key -> name, email, etc
    no need to trust anyone to tell us a public key's email address
    public key is self-certifying: just verify its signature on the email
    (still might need a large database where you might look up a public key)

SPKI/SDSI: a decentralized public key infrastructure
  instead of having a global trusted CA, everyone (every public key) is a CA
  CA can sign certificate mapping name to some value (+expiration)
  value can be another public key, or another name (think symbolic link)
    e.g. my key is K_nickolai
      K_nickolai.srini  -> K_srini          [key]
      K_nickolai.victor -> K_srini.victor   [name]
    can chain these arbitrarily
  X.509 can be thought of as a restricted case of SPKI/SDSI,
    where every name starts with "Verisign&Others."
  users get to choose who they want to trust, or can delegate choice to others
    K_nickolai.mit -> K_mit   [by direct key exchange, perhaps]
    can now use MIT's notion of names when necessary
  can selectively delegate, based on sensitivity
    e.g. user may want to explicitly verify key for bank, email server
    but rely on Verisign's idea of what key should be for others
    avoids compelled certificate creation problems with SSL/TLS
  can have groups
    K_nickolai.857staff -> K_nickolai.srini, K_nickolai.victor
    of course others can refer to this group, and groups can nest

certificate revocation
  what happens when a certificate is discovered to be invalid?
    e.g. CA realizes it gave cert to wrong person, or person's key is stolen?
  certificate contains expiration date, chosen by CA at issue time
    but no real basis for choosing this date: no problem yet at issue time
  certificates are just cached entries from CA's database
  1. CRL (cert. revocation list)
    certificate authorities publish signed list of revoked certificates
      (or more likely, hashes or serial numbers of revoked certificates).
    problem A: list can get quite long.
    problem B: not on common path, so rarely used.
      some CAs' CRL URLs used to return HTTP 404 not found errors.
  2. client performs online check (e.g. OCSP)
    certificate-checking server must be highly reliable and scalable!
    real-world bug: OCSP, the online certificate status protocol
      wanted to allow scalability by allowing responses to say "busy"
      if the server is overloaded, it might not have time to sign "busy"
      so the protocol said that busy responses will not be signed
      adversary can just fake a "busy" response to use a revoked cert
    fundamentally, if the online server is unavailable, the client must either
      accept certs unchecked (vulnerable to revoked certs)
      or reject them (service becomes unavailable)
  3. server gets periodic proof of validity
    simple approach: issue certificates that are valid for short periods.
      this could get expensive: lots of public-key signing operations.
    more efficient way, due to Micali: represent validity using hash chains.
      if certificate is valid for N days, then:
        CA generates random x_N, remembers it
        computes x_{i-1} = H(x_i) for i = N, ..., 1
        embeds x_0 in the certificate, along with initial date
      to prove cert is valid on day (initial date + i), present x_i
        CA's server will only reveal x_i on day i
        client checks that hashing x_i a total of i times yields x_0
      low overhead: CA (hash), server (query once a day), client (hash)

do we need a certificate infrastructure? (take 2)
  suppose we do want human-readable names after all
  assumption for PKI was that we need to look up name -> public key
  can we construct a cryptosystem where public key is f(name)?
    more realistically, suppose we have a trusted server S
    S has a public-private key pair (PK_S, SK_S)
    can we have PK_name = f(PK_S, name)?
    i.e., "whoever server S thinks name should be"?
  turns out the answer is yes, there are such constructions
    to obtain SK_A, A has to talk to server and prove identity
    given SK_S, server can compute SK_x for an arbitrary name x
    PK_x can be computed without talking to server or anyone else
    look at paper by Boneh and Franklin (2003), ``Identity-Based
      Encryption from the Weil Pairing'' for a construction
  one big application of identity-based encryption (IBE): key escrow / recovery
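
sketch: going back to option 3 under certificate revocation, Micali's
hash-chain idea can be tried out in a few lines of Python. assumptions
made here for illustration: H is SHA-256, x_N is 32 random bytes, days are
plain integers counted from the cert's initial date, and the function
names make_chain / check_validity are made up for this example.

```python
import hashlib
import os

def H(x):
    """Hash function for the chain (assumption: SHA-256)."""
    return hashlib.sha256(x).digest()

def make_chain(n_days):
    """Run by the CA: pick random x_N, then compute x_{i-1} = H(x_i).

    Returns a list where chain[i] == x_i, so chain[0] is the x_0 that
    gets embedded in the certificate.
    """
    chain = [os.urandom(32)]          # x_N, kept secret by the CA
    for _ in range(n_days):
        chain.append(H(chain[-1]))    # x_{N-1}, x_{N-2}, ..., x_0
    chain.reverse()                   # reorder so the index equals the day
    return chain

def check_validity(x_0, day, proof):
    """Run by the client: hash the presented value `day` times, compare to x_0."""
    y = proof
    for _ in range(day):
        y = H(y)
    return y == x_0

chain = make_chain(30)                # cert valid for 30 days
x_0 = chain[0]                        # goes into the signed certificate
print(check_validity(x_0, 7, chain[7]))
```

the client also has to compute `day` itself (today minus the initial date
in the cert) rather than trusting the server's claim: a value x_7 released
on day 7 proves nothing about day 8, because getting x_8 from x_7 would
require inverting H.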