I posted the following to the namedroppers list, and I'm crossposting here for comments. Again, this is "in progress", and largely shaped by discussions with many people, NOT a finished work. Comments are greatly appreciated.
One thought (again, I don't know how much this has been hashed out) is how robust do DNS resolvers REALLY need to be?
Any comments on the following?
I personally believe that resolvers should be robust to anything short of a man-in-the-middle ("Aware & Blocking"), yet man-in-the-middle is almost irrelevant for purposes of defending DNS.  Thus as long as the DNS resolvers are robust against anything short of a man-in-the-middle, it's good in practice.
At the same time, such robustness must ONLY involve changes to the resolvers: the resolvers should be able to protect themselves, without relying on the authoritative servers making changes (although authoritative servers may suffer more load, especially servers which return some out-of-standard nonsense).
And probabilistic protection is perfectly good. We don't need perfect protection against blind attacks, we just need to ensure that the attacker resources involved are far better spent elsewhere.
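To put rough numbers on "attacker resources", here is a back-of-the-envelope sketch in Python. It assumes every forged packet is an independent guess against the full entropy, and the 100-byte packet size is my own placeholder, not a measured figure.

```python
import math

ASSUMED_PACKET_BYTES = 100  # placeholder size for one forged reply


def blind_success_probability(entropy_bits: float, forged_packets: int) -> float:
    """Chance that at least one of `forged_packets` blind guesses matches."""
    per_packet = 2.0 ** -entropy_bits
    # log1p/expm1 keep precision when per_packet is tiny
    return -math.expm1(forged_packets * math.log1p(-per_packet))


def packets_for_probability(entropy_bits: float, target: float) -> int:
    """Forged packets needed to reach `target` success probability."""
    per_packet = 2.0 ** -entropy_bits
    return math.ceil(math.log1p(-target) / math.log1p(-per_packet))


def attack_volume_terabytes(packets: int, packet_bytes: int = ASSUMED_PACKET_BYTES) -> float:
    """Traffic the attacker must send, in terabytes (10**12 bytes)."""
    return packets * packet_bytes / 1e12


if __name__ == "__main__":
    for bits in (20, 40):
        n = packets_for_probability(bits, 0.5)
        print(bits, "bits:", n, "packets,", attack_volume_terabytes(n), "TB")
```

Under those assumptions, an even chance against 20 bits costs well under a million packets, while an even chance against 40 bits costs tens of terabytes of traffic.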
Against blind attacks, I believe the following principles should apply:
- No birthday attacks, of course.
- >=20 bits of entropy for purposes of protecting the transaction
- >=40 bits of entropy for purposes of protecting the cache
- Detecting when network entropy (port #) is reduced or eliminated by an in-path network device, allowing the resolver to accurately estimate the entropy it is generating in the requests.
- Detect and respond when there is a 0.001% chance that a blind attack has created one or more corrupt cache entries, by either revalidating or voiding the cache.
- NO "race until win" scenarios for ANY given mapping in the cache. ALL blind attacks should be either "Race once per TTL" or "Race until another race until win". Blind attacks should always cost time as well as packets.
ALL of the previous principles seem doable using existing techniques (port and 0x20 randomization, duplication, a single global counter to count unsolicited replies, and some seemingly simple data acceptance policies).
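As a rough illustration of the entropy budget, the sketch below adds up the bits a resolver might count on and works out how many duplicate queries would be needed to reach a target. The function names are mine. It assumes one bit per ASCII letter for 0x20, and that duplicated queries which must all agree multiply the attacker's work, so their bits add. A real resolver would have to measure the port bits instead of assuming them, since an in-path device may reduce them to zero.

```python
import math

TRANSACTION_BITS = 20  # target for protecting the transaction
CACHE_BITS = 40        # target for protecting the cache


def count_0x20_bits(qname: str) -> int:
    """Each ASCII letter in the query name can carry one bit of case."""
    return sum(1 for ch in qname if ch.isascii() and ch.isalpha())


def request_entropy(qname: str, txid_bits: int = 16, port_bits: float = 16.0) -> float:
    """Estimated entropy of one query: TXID + source port + 0x20 case bits.

    Pass port_bits=0 when a NAT or firewall is known to derandomize ports.
    """
    return txid_bits + port_bits + count_0x20_bits(qname)


def copies_needed(per_query_bits: float, required_bits: float) -> int:
    """Duplication: number of independent queries whose answers must agree.

    Assumes k agreeing queries are worth k * per_query_bits.
    """
    if per_query_bits <= 0:
        raise ValueError("per-query entropy must be positive")
    return max(1, math.ceil(required_bits / per_query_bits))


if __name__ == "__main__":
    bits = request_entropy("example.com", port_bits=0)  # ports derandomized
    print("bits per query:", bits)
    print("copies for transaction:", copies_needed(bits, TRANSACTION_BITS))
    print("copies for cache:", copies_needed(bits, CACHE_BITS))
```

With the port entropy gone, a single query for a short name can still meet the transaction target, but caching the result would need a duplicate query.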
Against transparent attacks, I believe the following should apply:
If a resolver, within its normal transaction timeout, receives a second or subsequent reply with a differing value but which would have been otherwise accepted (matching on TXID, port, server, 0x20, etc), this must be considered an attack on both the transaction and the cache. 
If the resolver is an intermediary, it MUST forward the second response back to the original requester, with the TTL for the "new" value set to 1 second.
For the resolver's cache, any entry which MAY have been potentially affected by the second reply MUST be treated as corrupted by an attack. Thus, depending on how much information is recorded and available, various cache entries may need to be voided or undone (up to all cache entries for the domain in question, if more precise record keeping is unavailable). Even if this would not be considered an attack, it doesn't actually make sense to cache anything which may have been dependent on this response.
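A minimal sketch of this rule, assuming a toy in-memory resolver. The class and method names are hypothetical. The "domain in question" is crudely approximated as the last two labels of the name, which ignores public suffixes, and replies are assumed to have already matched on TXID, port, server and 0x20 before they reach `on_reply`.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

ATTACK_FORWARD_TTL = 1  # seconds; TTL used when forwarding the "new" value

Answer = Tuple[str, ...]


@dataclass
class Pending:
    qname: str
    deadline: float                      # end of the normal transaction timeout
    first_answer: Optional[Answer] = None


class Resolver:
    def __init__(self) -> None:
        self.cache: Dict[str, Answer] = {}
        self.pending: Dict[int, Pending] = {}

    def start(self, key: int, qname: str, now: float, timeout: float = 2.0) -> None:
        self.pending[key] = Pending(qname, now + timeout)

    def on_reply(self, key: int, answer: Answer, now: float) -> Tuple[str, Optional[int]]:
        """Return (verdict, TTL override to use when forwarding downstream)."""
        p = self.pending.get(key)
        if p is None or now > p.deadline:
            return ("ignored", None)
        if p.first_answer is None:
            p.first_answer = answer
            self.cache[p.qname] = answer
            return ("accepted", None)
        if answer == p.first_answer:
            return ("duplicate", None)
        # A differing reply that would otherwise have been accepted:
        # treat it as an attack on both the transaction and the cache.
        self.void_domain(p.qname)
        return ("attack", ATTACK_FORWARD_TTL)

    def void_domain(self, qname: str) -> None:
        """Coarse fallback: drop every cached name under the same domain."""
        suffix = ".".join(qname.rstrip(".").lower().split(".")[-2:])
        for name in list(self.cache):
            n = name.rstrip(".").lower()
            if n == suffix or n.endswith("." + suffix):
                del self.cache[name]
```

A resolver that records which cache entries depended on which response could void far less than the whole domain. This sketch takes the coarse fallback only because it keeps no such records.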
For the final client, it is up to the client to determine what to do, but the strong suggestion is to consider this an attack and abort any transaction involving this name.
However, if the end client does not treat this as an attack and abort subsequent TCP connections to the final IP, this loss of transaction protection means that the few authorities which trigger this alert through "normal" behavior will simply see their results not cached rather than not accepted by the client.
 If an attacker can man-in-the-middle the naming stream of a victim, he can probably man-in-the-middle the data stream. If the end protocol desired is weak to a man-in-the-middle, reliable naming does no good. If the end protocol is robust to a man-in-the-middle, it never really trusted the DNS record.
 This is why I'm not a believer in DNSSEC. It has bad network effects (both the resolver and the authority need to change behavior, yet the changed behavior only protects the intersection of the two sets, so neither resolvers nor authorities have strong internal incentives to deploy), yet it does not actually provide, to my mind, all that much meaningful system security, because of the other options available to a man-in-the-middle in practice.
 Blind transaction attacks, when combined with protections that provide "race once per TTL per name", as far as I can tell, only work against some seriously broken protocols. By specifying a different level of protection for transactions and the cache, this allows duplication to protect the cache while still successfully resolving (but NOT caching) sites which return different values on back-to-back queries when duplication is needed to meet the entropy budget.
 Anyone who wants to launch a TB DOS to poison my cache has better things to do.
 I believe that the proper policy on accepting data into the cache can provide this, combined with treating a failure to lookup a name as an event with a ~30 second TTL. This property is an important COMPLEMENT to increased entropy, not a substitute. "Race until win" does not reduce the number of packets an attacker requires, only the time period.
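One way such an acceptance policy might look, as a sketch only: the cache below never overwrites a live entry, and it records a failed lookup as an entry with a 30-second TTL. The class name and the exact rules are my assumptions for illustration, not a worked-out policy.

```python
from typing import Dict, Optional, Tuple

FAILURE_TTL = 30.0  # seconds; a failed lookup is cached like any other event


class RaceLimitedCache:
    def __init__(self) -> None:
        # name -> (value or None for a recorded failure, expiry time)
        self._entries: Dict[str, Tuple[Optional[str], float]] = {}

    def lookup(self, name: str, now: float) -> Tuple[bool, Optional[str]]:
        """Return (hit, value); value is None for a live failure entry."""
        entry = self._entries.get(name)
        if entry is None or now >= entry[1]:
            return (False, None)
        return (True, entry[0])

    def store_answer(self, name: str, value: str, ttl: float, now: float) -> bool:
        """Accept data only when no live entry exists for this name."""
        hit, _ = self.lookup(name, now)
        if hit:
            return False  # no second race within the same TTL
        self._entries[name] = (value, now + ttl)
        return True

    def store_failure(self, name: str, now: float) -> None:
        """Record a failed lookup so it cannot be retried immediately."""
        hit, _ = self.lookup(name, now)
        if not hit:
            self._entries[name] = (None, now + FAILURE_TTL)
```

As long as a name has a live entry, whether an answer or a failure, the resolver sends no new query for it, so there is no new race for an attacker to join until the entry expires.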
 This will occur with high probability whenever there is a blind attack that doesn't also block the authoritative server or create traffic with a DOS condition. It will ALSO occur with high probability if an attacker is able to directly observe the request to craft a reply, but is not fully in path. More importantly, a preliminary survey suggests that only a few authorities actually do return such nonsense.