HotlinkBlocker 1.0.0.63 Released

January 28, 2009 by iamanton

The new build of HotlinkBlocker adds one important feature: you can now exclude the client IP address from signature generation.


This prevents issues with the AOL browser (and not only with it), which requests HTML pages and images from different IP addresses.

Introduction to mod_gzip

January 23, 2009 by develworld
(via HeliconTech Official Blog)

Every day millions of web servers around the world handle billions of bytes of network traffic. Internet connections get faster every year, and hosting providers offer attractive tariffs. It may seem that mankind is about to forget the problem of saving traffic altogether. But even though hard disks keep growing, users still haven’t forgotten about archivers, and the same holds for web traffic. You may say: “I have an unlimited 10 Mbit connection, so saving traffic isn’t my problem.” Yes, 10 Mbit is very good. But what will you say when you learn that it is possible to save more than 60% of the traffic? First of all, lots of users have connections slower than 10 Mbit. Moreover, the growing popularity of mobile devices makes traffic saving even more important: many PDA, cellphone and smartphone users will say ‘thank you’ if your web server saves their traffic and money. To sum up, traffic saving is a timely and important part of modern web server technology.

Until recently HeliconTech had one specialized solution for content compression – HeliconJet. Given its importance, we have decided to include its functionality in our new product, Helicon Ape. Since Ape stands for APache Emulation, it’s very important not to invent new syntax and directives but to use existing Apache assets.

There are two popular compression modules: the conventional mod_deflate and mod_gzip. The latter is written by a third-party developer and is not supplied with Apache. We have decided to implement both modules because users rely on them to an equal degree. At the moment only basic mod_gzip functionality is implemented, but we are planning to extend it in the near future. Technically, Ape will have one compression module that supports both mod_gzip and mod_deflate syntaxes. Our primary goal is to let you use your existing Apache configuration without any changes.

Let’s have a look at the basic principles of content compression and mod_gzip operation. This module uses the GZIP format, which is based on the Deflate compression algorithm. The module is built on a .NET version of the popular ZLib library. Please note that Helicon Ape is written in managed code only!

A web client (browser) exchanges technical information (so-called HTTP headers) with the web server. These headers contain important information that helps the client and the server understand each other. The client can state which data types it accepts and which content it needs. Taking the client’s abilities into account, the server prepares and sends the content. The headers of the response then help the client understand what to do with it.

But we are not going to dive deep into the subtleties of the HTTP protocol, as there is plenty of information on this topic on the Internet. Let’s return to mod_gzip. The general scheme of its operation is given below:

General scheme

As you can see, the server is not the only party that decides whether to compress content. That is easy to understand: if the browser isn’t capable of decompressing GZIP, all of mod_gzip’s work is pointless and the user gets rubbish. The web client must send an Accept-Encoding header with the gzip, x-gzip or deflate value to let mod_gzip know that it supports compression.

In its turn, if the module decides to compress the content, it sets the Content-Encoding: gzip header to inform the client that GZIP decompression must be used. So each link in the scheme above plays an important role.

But to better understand mod_gzip logic, please have a look at this flowchart:

mod_gzip logic

mod_gzip uses this sequence to decide whether or not to compress. Here is a brief explanation of each stage:

  • When a request comes to the server, mod_gzip (if it’s ON) can start its “dirty”
    work.
  • First, the module determines whether the content is already compressed. If it is, mod_gzip
    leaves things as they are.
  • If it’s not, the module analyses the request headers sent by the client. mod_gzip
    can move on only if there’s an Accept-Encoding header with the gzip, x-gzip
    or deflate value.
  • In the next step the module performs the checks set by
    specific directives inside the configuration files. Based on the results of these
    checks, the decision about content compression is made.
  • If GZIP is to be used, the module sets the Content-Encoding: gzip header, because
    otherwise the client may fail to process the server response correctly.
  • Besides, there’s a special Vary header in which mod_gzip specifies what its actions depend on
    (Vary: Accept-Encoding). This header is used for caching, so its detailed description will appear in
    upcoming articles.
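
The stages above can be sketched in a few lines of Python. This is only an illustration of the decision sequence as described in this article, not Helicon Ape’s actual code: the function name gzip_decision, its parameters and the single MIME-pattern check (standing in for the configuration directives) are assumptions made for the example. Header names are looked up case-sensitively to keep the sketch short.

```python
import re

# Encodings that, per the article, tell mod_gzip the client supports compression.
SUPPORTED_ENCODINGS = {"gzip", "x-gzip", "deflate"}


def gzip_decision(request_headers, response_headers, content_type,
                  include_mime=r"^text/.*", module_on=True):
    """Return the headers to add if the response should be gzipped, else None.

    Hypothetical helper: `include_mime` stands in for the configuration
    checks; a real module evaluates several directives at this stage.
    """
    # Stage 1: the module must be switched on.
    if not module_on:
        return None

    # Stage 2: leave content that is already compressed as it is.
    if response_headers.get("Content-Encoding"):
        return None

    # Stage 3: the client must announce a supported encoding.
    accepted = {
        token.split(";")[0].strip().lower()
        for token in request_headers.get("Accept-Encoding", "").split(",")
    }
    if not accepted & SUPPORTED_ENCODINGS:
        return None

    # Stage 4: configuration checks (here: a single MIME-type pattern).
    if not re.match(include_mime, content_type):
        return None

    # Stages 5 and 6: announce the encoding and what the response varies on.
    return {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
```

For a browser that sends Accept-Encoding: gzip, deflate and requests an uncompressed text/html page, the sketch returns both headers; for an image/jpeg response or a client without Accept-Encoding it returns None.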

It’s possible that the next versions will use slightly different logic, but we’ll be sure to let you know about that.

Summary

This article is just a brief introduction to Helicon Ape mod_gzip module.
We are thinking of writing much more material on that and other topics to help you use our little agile monkey (Ape) easily and efficiently.

Best wishes,
HeliconTech Team

Guide: Example of mod-cache application

January 21, 2009 by iamanton

(via HeliconTech blog)

In the previous articles we told you what cache is and how it works in Helicon Ape. Now it’s time to put that knowledge into practice. Today we are going to apply caching to a PHP application called qdig, which helps organize an image web gallery. Read how to register PHP on IIS7 in our article about WordPress.

Creating online photo album

Let’s create a photos folder in the site root and fill it with our photos. Next we download qdig. To keep things simple, we’ll extract only the index.php file and put it into the same directory.

Photos directory

The gallery is already working: http://localhost/photos/index.php

qdig working

Measuring performance

To measure request rate we’ll use ab.exe application:

ab.exe -n 200 -c 2 "http://localhost/photos/index.php?Qwd=.&Qif=DSC00410.JPG&Qiv=name&Qis=M"

The result is a bit more than 16 requests per second.

Speed without cache

Switching on mod_cache and mod_expires

To enable necessary modules, let’s uncomment the following lines in Helicon Ape httpd.conf file:

LoadModule expires_module    modules/mod_expires.so
LoadModule cache_module      modules/mod_cache.so

Enabling modules in Helicon Manager

Analyzing cached request

To make mod_cache cache only unique requests rather than every request, let’s figure out what the qdig request parameters mean and how request uniqueness depends on them:

  • Qwd – folder where image files reside – AFFECTS request uniqueness;
  • Qif – file name – AFFECTS request uniqueness;
  • Qiv – mode of file names representation – AFFECTS request uniqueness;
  • Qis – image size – DOESN’T AFFECT request uniqueness;
  • Qtmp – representation mode – DOESN’T AFFECT request uniqueness.

Thus, the cache key will use only the Qwd, Qif and Qiv parameters.

The piece of config for mod_cache will look like:

<Files index.php>

    CacheEnable mem
    CacheVaryByParams Qwd Qif Qiv

</Files>

Expiration time

The index.php script does not set the Cache-Control and Expires headers but, as we already know, they are essential for successful caching. So we’ll set these headers ourselves, using mod_expires:

ExpiresActive On
ExpiresByType text/html "access 1 hour"

The directives above set the expiration time to 1 hour.

The resulting .htaccess is as follows:

004-enable-modules

Measuring performance once again

ab.exe -n 200 -c 2 "http://localhost/photos/index.php?Qwd=.&Qif=DSC00410.JPG&Qiv=name&Qis=M"

And now the result is about 94 requests per second!

Speed with mod_cache enabled

That’s all you need to do to achieve an almost sixfold performance gain (from about 16 to about 94 requests per second).

This example clearly demonstrates the ease and efficiency of the Helicon Ape caching feature.

How does mod_cache work?

January 12, 2009 by rukeba

(via HeliconTech blog)
The Helicon Ape release (coming very soon) will contain the mod_cache module. As we promised in our previous article, here is a more thorough description of how mod_cache operates.

mod_cache starts working

mod_cache comes on the scene after the authentication/authorization events but before the request handler is executed.
At this stage the module performs the following:

  • checks whether it’s possible to use cached response for the current request
  • if yes, generates a key and searches cached response using this key
  • if the response is found in cache, the module gives it back to the client and request processing is over — request handler is not invoked.

Cacheable or not cacheable: request check

request cacheability check

A response may be cached if the request meets the following requirements:

  • request method is GET
  • request does not contain Authorization header
  • Cache-Control request header must not be no-cache. This condition is ignored if CacheIgnoreCacheControl On is used
  • Pragma request header must not be no-cache. This condition is ignored if CacheIgnoreCacheControl On is used
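
The four request conditions above translate into a short check. Below is a minimal Python sketch under our own assumptions: the function name and the ignore_cache_control parameter (modelling CacheIgnoreCacheControl On) are hypothetical, and the header values are compared with a simple substring test rather than a full header parser.

```python
def request_allows_cached_response(method, headers, ignore_cache_control=False):
    """Return True if a cached response may be used for this request.

    `ignore_cache_control=True` models `CacheIgnoreCacheControl On`.
    """
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value.lower() for name, value in headers.items()}

    # Only GET requests are served from cache.
    if method.upper() != "GET":
        return False

    # Requests carrying credentials are never served from cache.
    if "authorization" in lowered:
        return False

    # The client may explicitly ask for a fresh response.
    if not ignore_cache_control:
        if "no-cache" in lowered.get("cache-control", ""):
            return False
        if "no-cache" in lowered.get("pragma", ""):
            return False

    return True
```

A plain GET passes the check; the same GET with Pragma: no-cache is rejected unless ignore_cache_control is set.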

mod_cache attempts to save response

When the request handler has completed its job and all defined filters have been applied to the response, mod_cache steps in again. At this stage the module performs the following:

  • checks whether the response is cacheable
  • checks if CacheEnable is set for this request
  • generates the cache key
  • determines how long to store the response in cache (absolute expiration time)
  • saves the response in cache under the key

Cacheable or not cacheable: response check

response cacheability check

The following conditions are considered when deciding whether a response is cacheable (all must be met at the same time):

  • request method is GET
  • response status is 200 (200, 203, 300, 301 or 410 in Apache)
  • Expires response header contains valid “future” date
  • response contains an expiration time (i.e. Expires or Cache-Control: max-age=XX headers), an Etag header or a Last-Modified header. This condition is ignored if CacheIgnoreNoLastMod is used
    • if the request has a QueryString, only responses containing an expiration time are cached (i.e. Expires or Cache-Control: max-age=XX headers). This condition is ignored if CacheIgnoreQueryString On is used
  • Cache-Control request header must not be no-cache. This condition is ignored if CacheStoreNoStore On is used
  • Cache-Control request header must not be private. This condition is ignored if CacheStorePrivate On is used
  • request does not contain Authorization header (for Apache: if Cache-Control contains s-maxage, must-revalidate or public)
  • Vary response header does not contain “*”.

Cache key generation

Response is saved in cache according to the key. This key includes:

  • normalized (canonical) request URI without QueryString or, in case of proxy request, normalized proxy request URL;
  • all QueryString parameters and their values in alphabetical order (default behavior)
    • CacheIgnoreQueryString On directive cancels addition of request parameters to the cache key
    • CacheVaryByParams param1 param2 ... directive defines parameters to be included into cache key
  • all request headers specified in CacheVaryByHeaders header1 header2 ... directive. Headers are not included to the cache key by default.
  • If response contains Vary header, all request headers specified in it are included into cache key.
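
To make the key composition more tangible, here is a minimal Python sketch that follows the list above. It is an illustration, not the module’s code: the function name, its parameters and the “|”-joined key format are our assumptions, URI normalization is not modelled, and the host name is left out of the key.

```python
from urllib.parse import parse_qsl, urlsplit


def cache_key(url, request_headers=None, vary_by_params=None,
              ignore_query_string=False, vary_by_headers=(), response_vary=""):
    """Build an illustrative cache key for a request.

    vary_by_params      models `CacheVaryByParams param1 param2 ...`
    ignore_query_string models `CacheIgnoreQueryString On`
    vary_by_headers     models `CacheVaryByHeaders header1 header2 ...`
    response_vary       is the value of the response `Vary` header, if any
    """
    headers = {k.lower(): v for k, v in (request_headers or {}).items()}
    parts = urlsplit(url)

    # 1. Request URI without the query string (normalization is skipped here).
    key_parts = [parts.path]

    # 2. Query string parameters in alphabetical order, optionally filtered.
    if not ignore_query_string:
        params = parse_qsl(parts.query, keep_blank_values=True)
        if vary_by_params is not None:
            params = [(n, v) for n, v in params if n in vary_by_params]
        key_parts += [f"{n}={v}" for n, v in sorted(params)]

    # 3. Request headers named by the directive or by the response Vary header.
    names = [h.lower() for h in vary_by_headers]
    names += [h.strip().lower() for h in response_vary.split(",") if h.strip()]
    for name in sorted(set(names)):
        key_parts.append(f"{name}:{headers.get(name, '')}")

    return "|".join(key_parts)
```

With vary_by_params set to Qwd, Qif and Qiv, two qdig URLs that differ only in Qis produce the same key and therefore share one cached response.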

When cached response dies

HTTP response is stored in cache for a specific period of time that is computed in the following way:

  • If response contains Expires header and its value is valid and does not refer to the past, cached response will be stored till the time specified in it.
  • If response contains Cache-Control header with either max-age=X or s-maxage=X, cached response will be stored in cache for X seconds.
  • If response contains Last-Modified header, cached response will be stored in cache until:
    expiry date = date + min((date – lastmod) * factor, maxexpire),
    where date – current date,
    lastmod – value of Last-Modified header,
    factor – float value set via CacheLastModifiedFactor directive (default value = 0.1),
    maxexpire – value set via CacheMaxExpire directive (default value = 86400 seconds = 1 day).
  • If mod_cache was unable to calculate the expiration date using one of the aforementioned methods (this is possible if the response doesn’t have Expires, Cache-Control or Last-Modified headers BUT has an Etag header), the expiration time defaults to 1 hour, which may be changed using the CacheDefaultExpire directive.
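
The rules above can be written down as a small Python function. This is a sketch under stated assumptions: the function and parameter names are ours, the rules are tried in the order in which they are listed (the article does not say which one wins when several headers are present), and the defaults mirror the values quoted above (factor 0.1, maxexpire 86400 seconds, default expiration 1 hour).

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def _parse_http_date(value):
    """Parse an HTTP date; return None if it is missing or malformed."""
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    # Treat dates without an explicit zone as UTC so they can be compared.
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def expiry_time(headers, now, factor=0.1, max_expire=86400, default_expire=3600):
    """Return the moment until which a response stays in cache."""
    # Rule 1: a valid Expires header that does not refer to the past.
    expires = _parse_http_date(headers.get("Expires"))
    if expires is not None and expires > now:
        return expires

    # Rule 2: Cache-Control with max-age=X or s-maxage=X (X in seconds).
    match = re.search(r"(?:s-maxage|max-age)=(\d+)", headers.get("Cache-Control", ""))
    if match:
        return now + timedelta(seconds=int(match.group(1)))

    # Rule 3: date + min((date - lastmod) * factor, maxexpire).
    lastmod = _parse_http_date(headers.get("Last-Modified"))
    if lastmod is not None:
        age = (now - lastmod).total_seconds()
        return now + timedelta(seconds=min(age * factor, max_expire))

    # Rule 4: nothing usable (e.g. only an Etag header): default expiration.
    return now + timedelta(seconds=default_expire)
```

For example, a response last modified 20 days ago would get 2 days by the factor alone, so it is capped at maxexpire and kept for exactly one day.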

This load of text might look a little unclear for you at a glance, but in reality this is a well-composed and highly efficient scheme. And our upcoming article will convince you in this.

Web Caching: what is it?

January 9, 2009 by rukeba

(via HeliconTech blog)

What is that and what’s it for?

Web cache is a vital instrument for building lightning-fast web apps. A web cache stores HTTP responses that can be given to the user without making a request to the server, i.e. no ASP/PHP scripts are executed and no database queries are made. And that’s cool!
Web caching substantially reduces response time (the time the server needs to give the response), as reading from cache is much faster than processing the request with a PHP handler.

Web caching also minimizes traffic: if intermediate caches (gateway or proxy caches) are used, the request doesn’t reach the origin server, and the response is returned by the intermediate caching server.

Cache breeds

Server cache

This cache works on the origin server. Applications and the server itself use it to store parts of responses (e.g. web pages) or complete responses. Server cache may be used at the application level (e.g. memcached + php or HttpRuntime.Cache + ASP.NET) or at the HTTP server level (e.g. mod_cache in Apache, OutputCache in IIS7).

Proxy cache

It lives between clients and origin servers and may only store public representations that do not require authorization (unlike private representations). Proxy cache is widely used by providers to reduce traffic.

Browser cache

It lives in the browser and is capable of storing private data. Browser cache is used, for example, for the Back button.

How does Server Cache work?

Cacheless configuration

A cacheless configuration forces the server to process each incoming request and generate a new response even if the same resource is requested several times in a row. That is a pointless, time- and resource-consuming operation that puts excessive load on the server.

cacheless web server configuration

First request to cache-enabled server

First request to caching server

When a specific resource is requested from the server for the first time, the caching system checks whether the response may be cached, then looks for the response in cache and fails to find it. The request moves further along the server pipeline, triggering the necessary handlers and filters. When the response is ready, the caching system saves it to cache before sending it to the client.

Subsequent requests to cache-enabled server

Subsequent requests to caching web server

Upon further requests for this resource, the caching system checks whether the response may be cached, then looks for the response in cache and this time finds it! The response is retrieved from cache and sent to the client. And that’s it! No server handlers or filters are executed.

Responses are stored in cache for a certain period of time. When this time elapses, the cached response is marked as invalid and removed from cache. The next request for the same resource is processed as if it were requested from the server for the first time (see “First request to cache-enabled server”).
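
The whole cycle (first request, subsequent requests, expiration) fits into a tiny in-memory cache. The Python sketch below only illustrates the idea described above; the class name ResponseCache, its handle method and the injectable clock are our own assumptions, not part of any real server.

```python
import time


class ResponseCache:
    """A toy server cache: responses live for `ttl_seconds`, then expire."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the behaviour can be tested
        self._entries = {}       # key -> (expires_at, response)

    def handle(self, key, handler):
        """Return the cached response for `key`, or run `handler` and cache it."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None:
            expires_at, response = entry
            if now < expires_at:
                # Subsequent request: no handlers or filters are executed.
                return response
            # The entry has expired: remove it and start over.
            del self._entries[key]

        # First request (or first one after expiration): do the real work.
        response = handler()
        self._entries[key] = (now + self._ttl, response)
        return response
```

Calling cache.handle("/index.php", render_page) twice within the lifetime runs the hypothetical render_page handler only once; after the lifetime elapses the handler runs again and the fresh response replaces the old one.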

Conclusion

As you can see, a server cache lowers server load and shortens response time. In the next article about caching we’ll explain this process more thoroughly and illustrate it with examples.

Guide: Tuning WordPress with Helicon Ape on IIS7 (permalinks, browser/server caching, compression)

December 24, 2008 by rukeba

WordPress is a highly popular and rapidly growing open source CMS with powerful blogging facilities. Given its popularity, we decided to write an article showing how to optimize, speed up and prettify WordPress using Helicon Ape. So, let’s start.

Prerequisites: Windows 2008/Vista, IIS7, FastCGI, Helicon Ape

Step 1. MySQL

After installing MySQL, run the MySQL Command Line Client and execute the following command:

create database wordpress;

Step 2. PHP

Install PHP into C:\inetpub\php so that it inherits the NTFS permissions of the IIS directory. It is important that C:\inetpub\php\php.ini contains the following directives:

magic_quotes_runtime = Off
extension_dir = "C:\inetpub\php\ext"
extension=php_mysql.dll

After ensuring the directives are in place you need to register PHP in IIS configuration. To accomplish this:

  • open IIS Manager
  • open Handler Mappings snap-in
  • click ‘Add Module Mapping’ and fill in the fields of the window that opens as shown in the next screenshot

Step 3. WordPress

Download the latest WordPress version

Unzip the package into C:inetpubwwwrootwordpress

Rename wp-config-sample.php into wp-config.php.

Adjust MySQL connection:

Create a blog

Now in the Administrative panel you can set an SEO-friendly format for your links, e.g.: /%post_id%/%postname%

Your links will now look like http://localhost/wordpress/2008/uncategorized/hello-world
instead of http://localhost/wordpress/?p=123. If you now attempt to access any of the pages on your blog, you’ll get a 404 Not Found error.

And that is definitely not the desired result!

Step 4. Helicon Ape

It’s time to fix the above inconveniences and accelerate WordPress.

We’ll use the following Helicon Ape modules:

4.1 mod_rewrite

It’ll assist us in handling SEO-friendly URLs.

Open Helicon Ape Manager and uncomment/add the following line in httpd.conf to enable mod_rewrite:

LoadModule rewrite_module  modules/mod_rewrite.so

Now in Helicon Ape Manager browse to WordPress folder and put the following code to .htaccess:

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . index.php [NC,L]

This piece of code checks whether the requested resource physically exists on disk and, if it doesn’t, rewrites the request to index.php.
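
The same decision can be expressed in plain code. The Python sketch below mirrors the two conditions (!-f and !-d) and the rule; the function name rewrite_target and the document_root parameter are hypothetical, and the sketch ignores details such as path sanitizing that a real server takes care of.

```python
import os


def rewrite_target(document_root, request_path):
    """Return the path that will actually be served for `request_path`."""
    relative = request_path.lstrip("/")
    if not relative:
        # The pattern "." needs at least one character, so "/" is left alone.
        return request_path

    candidate = os.path.join(document_root, *relative.split("/"))

    # RewriteCond %{REQUEST_FILENAME} !-f  /  !-d
    if os.path.isfile(candidate) or os.path.isdir(candidate):
        return request_path

    # RewriteRule . index.php [NC,L]
    return "/index.php"
```

Existing files such as stylesheets and images are served as they are, while a pretty permalink that does not exist on disk ends up in index.php, where WordPress resolves it.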

Save changes to .htaccess. Request any blog page once again… It works!

Yes, it does… but not the best way… So we are moving on! ‘Cause only the best is good enough:)

4.2 mod_expires: client/proxy content expiration

With this module we’ll tune the browser cache so that the browser doesn’t send unnecessary requests to the server. Go back to httpd.conf in Helicon Ape Manager and uncomment/add the following line:

LoadModule expires_module  modules/mod_expires.so

The following lines (put into .htaccess) tell the browser cache that images, CSS and JavaScript files must not be requested from the server (but rather taken from cache) within 1 day of the first retrieval:

ExpiresActive On
ExpiresByType image/jpeg "access plus 1 days"
ExpiresByType image/gif "access plus 1 days"
ExpiresByType text/css   "access plus 1 days"
ExpiresByType application/x-javascript  "access plus 1 days"

Let’s check what’s changed. The first request grabs all page resources, and for each image, CSS and JavaScript file the server sets the header Cache-Control: max-age=86400

Upon subsequent requests for that page the browser does not ask the server for images, CSS and JavaScript files but takes them from the cache until max-age expires.

4.3 mod_gzip

Are you enjoying the process? Then let’s implement the next stage. To make things even better, let’s apply on-the-fly compression that will reduce traffic and speed up page load. In our irreplaceable Helicon Ape Manager please uncomment/add the following line inside httpd.conf:

LoadModule gzip_module  modules/mod_gzip.so

And put the following lines into .htaccess in WordPress folder:

mod_gzip_on yes
mod_gzip_item_include Mime ^text/.*

These directives instruct Helicon Ape to compress only those resources whose type starts with “text/”, i.e. all text, .html and .css files. Compressing images and video is usually useless because these formats are already compressed.

The uncompressed page had Content-Length: 5152; after compression the same page became as small as Content-Length: 2168. So we’ve managed to reduce the amount of data to be transferred by more than half.
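
If you’d like to get a feeling for such numbers without a server, Python’s standard gzip module gives a rough idea. This is just an illustration with a made-up sample page; the helper name compression_saving is ours, and real pages will compress differently depending on their content.

```python
import gzip


def compression_saving(body: bytes) -> float:
    """Return the share of bytes saved by gzip (0.0 .. 1.0)."""
    compressed = gzip.compress(body)
    return 1 - len(compressed) / len(body)


# A made-up, repetitive HTML page; markup usually compresses well.
page = (b"<html><body>"
        + b"<p>Hello, world! This is a blog post.</p>" * 100
        + b"</body></html>")

print(f"sample page: {len(page)} bytes, {compression_saving(page):.0%} saved")

# The figures quoted above: 5152 bytes before, 2168 bytes after compression.
article_saving = 1 - 2168 / 5152
print(f"article example: {article_saving:.0%} saved")
```

For the page measured in this article the saving works out to roughly 58%.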

4.4 mod_cache

We are approaching the finish, and as the final step we’ll enable server-side caching. Instead of generating the same page again and again, Helicon Ape saves a copy of the server response and serves it whenever the corresponding page is requested.

Uncomment/add the following line in httpd.conf (using Helicon Ape Manager):

LoadModule cache_module  modules/mod_cache.so

We’ll only cache the index.php page. It will be stored in cache for 30 seconds so that the site still looks dynamic. There’s no need to cache static files (.html, .css, images, Flash, videos) because IIS processes them really fast.

Note! It is worth mentioning that cache distinguishes query string parameters, thus it will store different snapshots for different blog pages.

This type of caching (for a semi-dynamic web application) has some shortcomings. For example, if a user posts a comment on the blog, they will see the result (their post) immediately, but another user browsing this page will only see it up to 30 seconds later (after the cache expires). This issue may be mitigated by reducing the time a snapshot lives in cache, but it should nevertheless be taken into account.

Put the following lines into .htaccess in WordPress folder:

<Files index.php>
ExpiresByType "text/html; charset=UTF-8" A30
CacheEnable mem
</Files>

Let us compare the performance with and without cache. On our testing virtual machine we’ve obtained the following values:

  • without cache – 7.45 requests per second

  • with cache – 711 requests per second!

4.5 mod_headers

The last drop we want to add to our WordPress-based dish will hide IIS from robots and scanners.

You need to uncomment/add the following line in httpd.conf:

LoadModule headers_module  modules/mod_headers.so

And put a single line into .htaccess:

Header set Server "Apache/2.2.9 (Unix)"

Now our IIS appears to the world as Apache/2.2.9 (Unix).

Congratulations to everyone who made it to the finish! It’s all done now! Enjoy flawless and fast WordPress operation and stay with us!

Best regards,
Helicon Team.

Helicon Ape 1.0.0.12 – firm step ahead!

December 22, 2008 by rukeba

Quite a lot of time has passed since the release of the first public build of Helicon Ape (1.0.0.10). We’ve made every effort to spend this time as efficiently as possible to prepare a small Christmas gift for you. And here it is, ready to delight you!

The main improvements in build 12 include:

  • check for available updates
  • kernel cache support
  • password generator for basic/digest authentication
  • NE, ENV flags processing in mod_rewrite

Here’s the complete change log.

Guide: URL-rewriting basics and map-files application

December 19, 2008 by rukeba

A lack of understanding of basic URL-rewriting concepts often leads to problems with writing rules. So we decided to give a brief and simple explanation of some general concepts.

URL-rewriting lets you substitute real (often ugly) URLs with pretty ones and expose them to users as well as to search engines. The idea is that the user requests, for example, http://www.site.com/pretty_file.htm (this link is indexed by search engines), while in reality the browser shows the content of, say, http://www.site.com/index.aspx?id=123 (the real physical file on your server).

Regular expressions empower you to create more complex and efficient rules and add conditions to gain better flexibility and performance.

There are several ways of writing rules; the choice depends on the specific situation. For example, if you have these pages:

Real URLs            Rewritten (pretty) URLs
/index.php?q=444 => /page.html
/index.php?q=345 => /another-page.html
/index.php?q=999 => /about.html

You may EITHER use the following rules to implement rewriting functionality:

RewriteRule ^/page\.html$         /index.php?q=444 [NC,L]
RewriteRule ^/another-page\.html$ /index.php?q=345 [NC,L]
RewriteRule ^/about\.html$        /index.php?q=999 [NC,L]

OR
you may use map-files (which are preferable in this case):

Info: Map-file is a .txt file containing pairs of values written in two columns (as shown below). The first (left) column represents the value to which RewriteRule matching result will be compared, and the corresponding value in right column represents the result that will be placed into the substitution URL.

In our example we’ll create a text document (e.g. map.txt) in the folder with .htaccess and put in the following:

page           444
another-page   345
about          999

And our configuration file (.htaccess) will have the following look:

# Set a variable ("map") to access map.txt from config
RewriteMap map txt:map.txt

# Use tolower function to convert string to lowercase
RewriteMap lower int:tolower

# Get requested file name
RewriteCond %{REQUEST_URI} ^/([^/.]+)\.html$ [NC]

# Seek file name in map-file
RewriteCond ${map:${lower:%1}|NOT_FOUND} !NOT_FOUND

# Perform rewriting if the record was found in map-file
RewriteRule .? /index.php?q=${map:${lower:%1}} [NC,L]

Helicon Ape Manager

Note! Map files are case-SENSITIVE. So, “Page” will not match “page”. That is why it is advisable to use tolower function that converts matched part to lowercase before comparing it with map-file entries. Don’t forget that in this case all map-file records should also be lowercase.
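
To see what the map-file rules above do step by step, here is the same logic as a short Python sketch. It is only an illustration: the helper names load_map and rewrite are ours, and the map content is the example map.txt from this article embedded as a string.

```python
import re

# The example map.txt from above: key in the left column, value in the right.
MAP_TEXT = """
page           444
another-page   345
about          999
"""

# Same pattern as the RewriteCond; re.IGNORECASE plays the role of [NC].
PATTERN = re.compile(r"^/([^/.]+)\.html$", re.IGNORECASE)


def load_map(text):
    """Parse two-column map text into a dict (keys are case-sensitive)."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, value = line.split(None, 1)
        mapping[key] = value.strip()
    return mapping


def rewrite(uri, mapping):
    """Return the rewritten URL, or None if no rewriting takes place."""
    match = PATTERN.match(uri)
    if not match:
        return None
    # tolower: convert the matched name to lowercase before the lookup.
    value = mapping.get(match.group(1).lower())
    if value is None:
        # Corresponds to the NOT_FOUND default: the rule is skipped.
        return None
    return f"/index.php?q={value}"
```

With this map, rewrite("/Page.html", load_map(MAP_TEXT)) gives /index.php?q=444 thanks to the lowercase conversion, while a name that is missing from the map yields None and the request is left untouched.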

Map-files are particularly advantageous when you have to rewrite loads of URLs of a similar pattern. The first benefit is that a map-file may have virtually unlimited size (up to several gigabytes); secondly, parsing a large map-file is much faster than processing a huge .htaccess (or httpd.conf). We don’t recommend using configuration files with more than 100-150 rules.

Hope this info made it easier for you to grasp the idea of URL-rewriting.

Looking forward to your comments and suggestions.

Sincerely Yours,
HeliconTech Team.

Guide: How to enable mod_rewrite in Helicon Ape

December 18, 2008 by rukeba

Hope you’ve already installed Helicon Ape and are ready to have a look at it. So, let’s start!

Run Helicon Ape Manager from Start – Programs – Helicon – Helicon Ape – Helicon Ape Manager.

To enable mod_rewrite in Helicon Ape you should simply add/uncomment the following directive in httpd.conf:

LoadModule rewrite_module  modules/mod_rewrite.so

Helicon Ape Manager

1. The easiest way to check if mod_rewrite is working correctly

Now to make sure mod_rewrite operates correctly add the following code into .htaccess in the root of your site:

RewriteRule . - [F]

Note! Don’t use this directive on live server!

Helicon Ape Manager

Now save changes to .htaccess

Helicon Ape Manager

Make any request to your site. If mod_rewrite works fine, the result will be 403 Forbidden, otherwise you’ll get 404 Page not found.

Hurrah! It works! Now let’s move on to the real example.

2. Example of basic rewrite rule

Let’s create a file, say test.asp, in the root of your site. And put the following code that will show us current date/time

Done? Then put the following rules into the .htaccess in the root of your site

RewriteEngine On
RewriteBase /
RewriteRule ^foo\.htm$ test.asp [NC,L]

Save .htaccess.

If you now request http://localhost/foo.htm, you’ll get the content of http://localhost/test.asp.

May your first steps with Helicon Ape be firm and sensible.

Sincerely Yours,
Helicon Team.

How we created Helicon Ape logo:)

December 16, 2008 by rukeba