BlogCFC Enhancement
Technical
Not surprisingly, I suppose, a lot of the blogs I read are powered by BlogCFC. While writing my iRead pod, that made one particular feature somewhat difficult to test. BlogCFC doesn't offer one feature that I think is pretty important and that I wanted my pod to be able to handle: recognition and handling of the If-Modified-Since and If-None-Match request headers. I selected a set of random test feeds to consume, but because of the prevalence of ColdFusion authors in my reading material, most of those feeds are running BlogCFC under the hood.
The Problem
The iRead pod consumes and parses any number of feeds so the impact on page rendering time could be considerable if every feed is consumed and parsed every time a blog page is requested. This is especially true if any of the feeds are large and do not adhere to the informal standard of including 10-25 feed items. One method for mitigating the performance impact of consuming a feed is to only consume it when it has changed.
The Solution
The means of "selectively" consuming a feed are varied, but one way is to pass in the If-Modified-Since and If-None-Match request headers. These values represent the state of the feed the last time it was consumed. The server compares those request values with its own Last-Modified and ETag response header values, respectively. If the values match, the server returns a 304 Not Modified status code with no feed content, so the request costs very little on either side.
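To make the aggregator's side of that exchange concrete, here is a minimal sketch in Python (chosen only for illustration; it is not code from the iRead pod). The `cached` dictionary and both function names are hypothetical stand-ins for however a consumer stores the header values from its previous fetch.

```python
def conditional_headers(cached):
    """Build request headers from the validators saved after the last fetch.

    `cached` is a hypothetical dict holding the Last-Modified and ETag
    response header values from the previous request, if there were any.
    """
    headers = {}
    if cached.get("last_modified"):
        headers["If-Modified-Since"] = cached["last_modified"]
    if cached.get("etag"):
        headers["If-None-Match"] = cached["etag"]
    return headers


def feed_changed(status_code):
    """A 304 means the cached copy is still current, so parsing can be skipped."""
    return status_code != 304


# Values saved from an earlier response (illustrative only).
cached = {
    "last_modified": "Tue, 03 Jun 2008 14:05:00 GMT",
    "etag": '"5d41402abc4b2a76b9719d911017c592"',
}
request_headers = conditional_headers(cached)
```

The resulting dictionary can be handed to whatever HTTP client the consumer uses. On a first fetch, when nothing is cached, it is empty and the server simply returns the whole feed.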
BlogCFC
Unfortunately, BlogCFC doesn't currently support this header matching. The feed content is cached in server memory for speedy rendering, but the entire feed content is returned with every request. In an effort to see whether I could make a really good package a little better, I decided to try to implement header matching. All of the changes shown below are made in /rss.cfm.
The Response Header Values
In order to match incoming request headers, I needed to decide how the corresponding response header values would be derived. I chose the following:
- Last-Modified
  - The pubDate value of the most recent feed item (the first feed item in the feed XML). That's logical enough. The pubDate value for the entire feed would also be a logical choice, but it changes with every request, as explained under ETag.
- ETag
  - This header can take pretty much any value, as far as I know, so my first thought was to simply hash the feed XML. Any change to the feed, then, would be reflected by a new ETag value. Unfortunately, the feed content changes with every request: specifically, the pubDate for the entire feed is rewritten. So, rather than change how the feed is rendered, I decided to just hash the Last-Modified value. Hashing the value isn't necessary, but I find it more readable if the two values aren't exactly the same.
Code Changes
Now that I knew how I wanted to do this, I needed to make the necessary changes to /rss.cfm. First, I needed access to the feed XML before it was rendered, so I moved the call to application.blog.generateRSS() out to a cfset statement. After that, a bit of parsing to extract the response header values:
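As a language-neutral illustration of that parsing step (a Python sketch, not the CFML that goes in /rss.cfm), the function below pulls the first item's pubDate out of an RSS 2.0 document and hashes it. The function name, the sample feed, the use of MD5, and the quoting of the ETag are all assumptions made for the example.

```python
import hashlib
import xml.etree.ElementTree as ET


def derive_validators(feed_xml):
    """Return (last_modified, etag) derived from the newest feed item.

    Assumes an RSS 2.0 layout (rss/channel/item/pubDate) with the most
    recent item listed first.
    """
    channel = ET.fromstring(feed_xml).find("channel")
    last_modified = channel.findtext("item/pubDate") if channel is not None else None
    if last_modified is None:
        return None, None  # empty feed: nothing to derive the values from
    digest = hashlib.md5(last_modified.encode("utf-8")).hexdigest()
    # HTTP entity tags are quoted strings, hence the surrounding quotes.
    return last_modified, '"%s"' % digest


# Two renderings of the same feed; only the feed-level pubDate differs.
FEED_TEMPLATE = """<rss version="2.0"><channel>
<title>Example blog</title>
<pubDate>{rendered_at}</pubDate>
<item><title>Newest post</title><pubDate>Tue, 03 Jun 2008 14:05:00 GMT</pubDate></item>
<item><title>Older post</title><pubDate>Mon, 02 Jun 2008 09:30:00 GMT</pubDate></item>
</channel></rss>"""

first_render = FEED_TEMPLATE.format(rendered_at="Wed, 04 Jun 2008 10:00:00 GMT")
second_render = FEED_TEMPLATE.format(rendered_at="Wed, 04 Jun 2008 10:00:07 GMT")

first_values = derive_validators(first_render)
second_values = derive_validators(second_render)
```

Both renderings yield the same pair of values even though the documents differ byte for byte, which is why hashing the item date holds up where hashing the whole feed would not.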
With the response header values now known, I needed to compare the values with the appropriate request header, if it was passed, and return the 304 status code if no changes had been made since the last request for the feed:
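The comparison logic can be sketched as follows, again in Python purely for illustration. The function name and the plain dictionary of request headers are hypothetical, and the exact string match is a simplification: a stricter server would parse the dates and treat If-Modified-Since as "not changed since this time" rather than requiring identical text.

```python
def is_not_modified(request_headers, last_modified, etag):
    """Return True when the client's saved values match the current ones.

    `request_headers` is assumed to be a dict keyed by the canonical
    header names. When both headers are present, If-None-Match decides
    the outcome, which follows the precedence HTTP gives to entity tags.
    """
    if_none_match = request_headers.get("If-None-Match")
    if_modified_since = request_headers.get("If-Modified-Since")
    if if_none_match is not None:
        return if_none_match == etag
    if if_modified_since is not None:
        return if_modified_since == last_modified
    return False  # no conditional headers: always send the full feed


current_last_modified = "Tue, 03 Jun 2008 14:05:00 GMT"
current_etag = '"abc123"'

# An aggregator replaying the values it stored on its previous fetch.
incoming = {
    "If-Modified-Since": "Tue, 03 Jun 2008 14:05:00 GMT",
    "If-None-Match": '"abc123"',
}
status = 304 if is_not_modified(incoming, current_last_modified, current_etag) else 200
```

When the function returns True, the server sends only the 304 status line and headers and stops processing, so the feed body is never written to the response.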
Finally, if I needed to serve the feed in its entirety, I had to ensure that the proper response headers were sent so that they could be stored by the aggregator that's consuming the feed:
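A short Python sketch of that last step, with the same caveat that it illustrates the idea rather than reproducing the CFML. The function name, the tuple it returns, and the Content-Type value are assumptions for the example.

```python
def full_feed_response(feed_xml, last_modified, etag):
    """Return (status, headers, body) for a feed that must be sent in full.

    The Last-Modified and ETag headers are what an aggregator stores and
    sends back as If-Modified-Since and If-None-Match on its next request.
    """
    headers = {
        "Content-Type": "text/xml; charset=utf-8",  # assumed feed content type
        "Last-Modified": last_modified,
        "ETag": etag,
    }
    return 200, headers, feed_xml


status, headers, body = full_feed_response(
    "<rss version=\"2.0\"><channel><title>Example blog</title></channel></rss>",
    "Tue, 03 Jun 2008 14:05:00 GMT",
    '"abc123"',
)

# What a well-behaved aggregator would keep for its next request.
stored_for_next_request = {
    "If-Modified-Since": headers["Last-Modified"],
    "If-None-Match": headers["ETag"],
}
```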
Conclusion
A few changes to one file and BlogCFC is now more aggregator-friendly. No more wasted bandwidth on the server serving content that hasn't changed and no more wasted time for the client waiting for content it's already consumed. All of this, of course, assumes that aggregators are using the proper request headers, but my (unscientific) understanding is that most, and certainly the good ones, do that.




