Archive for May, 2011

Non-WWW Domain Cookie Problems

If you come to this blog (etersoul.com), you may notice that the domain name appears without “www” in front of etersoul.com in your browser’s address bar or location bar. Even if you type www.etersoul.com, your browser will be redirected to “etersoul.com”. I did this intentionally some years ago, when I first started using this domain name and installed the blog engine for this site. The reason was that I once visited a site called no-www.org, which encourages people who own an internet domain name to drop the “www”, because “www” is considered deprecated and so should not be used.

If you visit that website, you’ll find the reason why the people behind no-www.org conclude that using “www” is deprecated:

By default, all popular Web browsers assume the HTTP protocol. In doing so, the software prepends the ‘http://’ onto the requested URL and automatically connect to the HTTP server on port 80. Why then do many servers require their websites to communicate through the www subdomain? Mail servers do not require you to send emails to recipient@mail.domain.com. Likewise, web servers should allow access to their pages though the main domain unless a particular subdomain is required.

Succinctly, use of the www subdomain is redundant and time consuming to communicate. The internet, media, and society are all better off without it.

I agreed with this statement, so I implemented a redirect that sends anyone who visits my site with “www” to the non-www address. With that setup, my site is now classified as Class B at no-www.org. I even applied this scheme to some of my other projects. However, this scheme is not without negative effects.

Some months ago, I ran some experiments with my domain name and browser cookies, which are pieces of data that a website stores in the user’s browser. I found that on a domain without www, a cookie created for the domain .etersoul.com is also sent to the server whenever the visitor opens any of my sub-domain sites. Well, it is by design that a cookie set for a domain can also be accessed by its sub-domains; the only other restriction you can add is the cookie’s path. However, restricting the path may cause other problems rather than solve this one.

Suppose you have an application located at the “/application1/” path on the “example.com” domain, and its cookie is restricted to that path (and, of course, to that domain). The browser will still send the cookie if you later have to place another application at the same path on a sub-domain (for example, “subdomain.example.com/application1/”). In another case, you may need the cookie to be accessible to another application on the same domain that you don’t want to place under the “application1” path. In the end, it is just a matter of how the developer handles the cookie data, so that the application ignores data that is not relevant to it, but of course that takes extra coding effort.
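To make the two cases above concrete, here is a minimal Python sketch of how a browser decides whether to send a cookie. It is a simplified model loosely following the domain and path matching rules of RFC 6265, not a real browser implementation; the function names are my own, it ignores IP addresses and other attributes, and it only covers cookies that carry an explicit Domain attribute.

```python
def domain_matches(request_host: str, cookie_domain: str) -> bool:
    """True if a cookie with this Domain attribute is sent to request_host."""
    host = request_host.lower()
    domain = cookie_domain.lower().lstrip(".")  # a leading dot is ignored
    # The host itself matches, and so does every sub-domain of it.
    return host == domain or host.endswith("." + domain)


def path_matches(request_path: str, cookie_path: str) -> bool:
    """True if request_path falls under the cookie's Path attribute."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # The prefix must end exactly at a "/" boundary.
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


def cookie_is_sent(host: str, path: str, cookie_domain: str, cookie_path: str) -> bool:
    """Both the domain and the path have to match."""
    return domain_matches(host, cookie_domain) and path_matches(path, cookie_path)


if __name__ == "__main__":
    cookie_domain, cookie_path = ".example.com", "/application1/"
    requests = [
        ("example.com", "/application1/index.php"),
        ("subdomain.example.com", "/application1/index.php"),
        ("example.com", "/application2/"),
    ]
    for host, path in requests:
        sent = cookie_is_sent(host, path, cookie_domain, cookie_path)
        print(host + path, "->", "cookie sent" if sent else "cookie not sent")
```

In this model the path limits the cookie to “/application1/”, but it cannot tell “example.com” apart from “subdomain.example.com”, which is exactly the first case described above.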

Some people may wonder, “Why should I be bothered if the cookies are sent to the sub-domains as well as the domain?” Well, the answer depends on your website: how many users access it, how many different applications are hosted on your domain and sub-domains, and whether your primary “no-www” domain has more users than all of your sub-domains combined. If you just run a small or personal website like this one, with irregular visitors who come and go (and perhaps some spiders, crawlers or spam bots), and with no sub-domains or just one or two rarely visited ones, you can simply ignore the problem. However, suppose you run a serious website with tons of users, thousands of hits per minute, many spam bots and crawlers draining your limited server power and network bandwidth, many sub-domains with some sub-sub-domains, and countless files that don’t “eat” your cookies, and the site always stops responding at busy times (and, if you are on shared hosting, you know the problem really is your own website). Then you will find that cookie optimization is one of several ways to reduce these problems effectively.

Here is the math. Suppose you have 10 cookies with 20 bytes of data each, “baked” for your main domain, say “.mydomain.com”, so the browser has to send a total of 200 bytes with every request. On one page of your web application, a user must fetch 6 JavaScript files, 4 CSS files and 29 images (some of them just icons under 1 kilobyte), so together with the page itself the user’s browser needs to send 40 different requests. But remember, every time the browser sends a request, it also sends the cookies designated for that domain and its sub-domains, so the total data sent for cookies alone is: 200 bytes x 40 requests = 8 KB. That is a small number for 1 user and 1 page request. Multiply it by 1000 users who each request 5 pages per minute on average, and the server and network have to handle about: 8 KB x 1000 users x 5 pages = 40 megabytes per minute, just for cookie data, assuming the static files are not cached by the user’s browser either. A huge waste of resources, of course.
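If you want to try the same calculation with your own numbers, a small Python sketch like this one will do. The constants are simply the example figures from the paragraph above, the function names are my own, and 1 KB is taken as 1000 bytes to match the rounding used there.

```python
COOKIES = 10                   # cookies set for the main domain
BYTES_PER_COOKIE = 20          # data carried by each cookie
REQUESTS_PER_PAGE = 40         # requests needed to load one page (figure from the post)
USERS = 1000                   # users active in the same minute
PAGES_PER_USER_PER_MINUTE = 5  # average pages each user opens per minute


def cookie_bytes_per_page() -> int:
    """Cookie data sent by one user while loading one page."""
    return COOKIES * BYTES_PER_COOKIE * REQUESTS_PER_PAGE


def cookie_bytes_per_minute() -> int:
    """Cookie data sent by all users in one minute."""
    return cookie_bytes_per_page() * USERS * PAGES_PER_USER_PER_MINUTE


if __name__ == "__main__":
    print("per page  :", cookie_bytes_per_page() / 1000, "KB")
    print("per minute:", cookie_bytes_per_minute() / 1_000_000, "MB")
```

Note that this counts only the cookie values themselves. Real requests also carry the cookie names and separators, so the true overhead is somewhat larger.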

That’s why even big websites like Google or Facebook don’t use the no-www scheme: they can independently assign cookies for the main site at “www”, cookies for sub-domains other than “www”, and cookies that are accessible from the whole site. Another workaround for this cookie problem is to use a separate domain to serve static content that doesn’t need any cookies at all, as Facebook does with fbcdn.net, which serves all user-uploaded images and static files. Of course, by using a CDN (Content Delivery Network) they also optimize many other things, such as compression and caching.
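As a rough illustration of those three cookie scopes, here is a sketch using Python’s standard http.cookies module to build the Set-Cookie headers. The cookie names, values and the example.com hosts are made up for this example, and I am not claiming this is how Google or Facebook actually set their cookies.

```python
from http.cookies import SimpleCookie


def build_set_cookie(name: str, value: str, domain: str, path: str = "/") -> str:
    """Return a Set-Cookie header line scoped to the given domain and path."""
    jar = SimpleCookie()
    jar[name] = value
    jar[name]["domain"] = domain
    jar[name]["path"] = path
    return jar.output(header="Set-Cookie:")


if __name__ == "__main__":
    # Only for the main site, which lives at "www".
    print(build_set_cookie("session", "abc123", "www.example.com"))
    # Only for one other sub-domain (hypothetical "mail" application).
    print(build_set_cookie("mailbox", "inbox", "mail.example.com"))
    # Shared by the whole site, including every sub-domain.
    print(build_set_cookie("lang", "id", ".example.com"))
```

With the no-www scheme, the first kind of cookie is the hard one: a cookie whose Domain is “example.com” also reaches every sub-domain. A separate static domain simply never gets any of these headers, so requests to it carry no cookies.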

Oh, another suggestion. If you really want to make a site, make it accessible both with and without www, but redirect the one you don’t use to the scheme you consider better. For this site, I redirect the www domain to the no-www one. This is for the sake of search engine optimization and, for some, bandwidth conservation: the browser treats the www and non-www addresses as different, so it will never reuse its cache when you request the same static file through both.
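The redirect decision itself is simple. Here is a minimal Python sketch of it, assuming a hypothetical canonical host “example.com”. It only works out where to send the visitor; in practice this is usually done in the web server configuration, answering with a permanent (301) redirect to the returned address.

```python
from typing import Optional

CANONICAL_HOST = "example.com"  # hypothetical: the no-www host you want to keep


def redirect_target(host: str, path: str, scheme: str = "http") -> Optional[str]:
    """Return the URL to redirect to, or None if the host is already canonical."""
    hostname = host.lower().split(":")[0]  # drop an optional port such as ":80"
    if hostname == "www." + CANONICAL_HOST:
        return f"{scheme}://{CANONICAL_HOST}{path}"
    return None


if __name__ == "__main__":
    print(redirect_target("www.example.com", "/2011/05/"))  # redirect to no-www
    print(redirect_target("example.com", "/2011/05/"))      # already canonical
```

To go the other way (no-www to www), swap the check and the target host. Either way, only one of the two addresses ever serves content, so the browser cache and the search engines see a single site.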

How about this site? I don’t think I really need to change my website to the www one. I am comfortable without www, but of course I will consider using www on my other projects.

By the way, this is the first technical post that I have written in English, so I apologize if there are mistakes in grammar or structure. Google Translate? Nope, I didn’t use that kind of thing when writing this article, since Google Translate would produce many weird words. I just tried to use my own sentences and, of course, my own writing style. :) However, if you are Indonesian and don’t understand this article, you could try Google Translate to help you. One more thing, please tell me if you find a mistake in my post. Thanks. :)