play with micolog -- handle 404
As soon as I set up this blog, I inserted Google ads, and I wanted Google to index my site as soon as possible. I submitted the site URL at http://www.google.com/addurl/ and wanted to submit a sitemap as well. Then I got stuck on verifying ownership of the blog.
Google Webmaster Tools offers two ways to verify site ownership: uploading an HTML file that Google specifies, or adding a meta tag that Google specifies. I chose the HTML file method first, but I ran into the error "Your site doesn't return a 4xx HTML status code for non-existing URLs". I then switched to meta tag verification, but Google stayed stuck on the previous error and kept showing me the same message even after I had switched methods.
So I gave up and decided to fix the underlying problem: the 404 page was being served with a 200 status. The micolog code is quite easy to follow, and the logs in GAE's admin console were very helpful for confirming my guess (although I only checked them after I had already made the change and was about to deploy). The problem is that after the first visit, the error (non-existent) page is cached like any other page, and the cached copy is served directly on the second visit without going through the code path that sets the HTTP status code to 404.
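To make the failure concrete, here is a minimal sketch of the behaviour described above. It does not use micolog or GAE code: `FakeResponse`, `render`, `handle` and the `cache` dict are hypothetical stand-ins, and the only assumption is that a fresh response starts at 200 and that the cache stores the body alone.

```python
# Hypothetical stand-ins; not micolog's or GAE's actual classes.
class FakeResponse(object):
    def __init__(self):
        self.status = 200  # every new response starts as 200 OK
        self.body = ""


cache = {}  # plays the role of memcache


def render(path):
    """Return (status, body); only "/" exists in this toy site."""
    if path == "/":
        return 200, "home"
    return 404, "not found"


def handle(path):
    response = FakeResponse()
    if path in cache:
        # Cache hit: only the body was stored, so the status stays 200.
        response.body = cache[path]
        return response
    # Cache miss: render the page, which sets the real status.
    response.status, response.body = render(path)
    cache[path] = response.body
    return response


first = handle("/missing")   # goes through render(), status is 404
second = handle("/missing")  # served from the cache, status is 200
```

The second request gets the same "not found" body but with the default 200 status, which is the kind of response the verification error complains about.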
Several solutions came to mind: 1. disable the cache for error pages; 2. define a separate way to cache error pages; 3. change the cache method to store the HTTP status code as well and set it before serving from the cache.
It turned out that methods 1 and 2 didn't work well because of the URL dispatch setup and the underlying design of micolog, but it was fun to read more of the code and find out why they wouldn't work. So I had to change the cache method to include the HTTP status. Then I ran into trouble getting the status code: GAE's Response object doesn't offer a way to read it. So I decided to fight it and read its private field directly (using Python's name-mangled form, object._ClassName__variable):
status_code = response._Response__status[0]
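For readers who have not seen this trick before, the following sketch shows how Python name mangling makes it possible. The `Response` class here is a hypothetical stand-in written for illustration, not GAE's real class; it only assumes that the status is kept as a (code, message) tuple in a double-underscore attribute, which is what the line above relies on.

```python
class Response(object):
    """Hypothetical stand-in for a response class with a private status."""

    def __init__(self):
        # Inside the class body, __status is stored as _Response__status.
        self.__status = (200, "OK")

    def set_status(self, code, message=""):
        self.__status = (code, message)


response = Response()
response.set_status(404, "Not Found")

# Outside the class, the attribute is only reachable by its mangled name.
status_code = response._Response__status[0]
```

This works, but it depends on an internal detail of the class, so it may break if the class is renamed or changes how it stores the status.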
Another way would be to set and store the status code somewhere in micolog's handler object, but I feel it would be better if GAE's Response object provided it. Anyway, here is my current change, and it works.
Index: base.py
===================================================================
73c73,76
< skey=key+ request.path_qs
---
> if key:
> skey=key
> else:
> skey=request.path_qs
80a84
> response.set_status(html[2])
89c93,94
< memcache.set(skey,(result,response.last_modified),time)
---
> status_code = response._Response__status[0]
> memcache.set(skey,(result,response.last_modified, status_code),time)