Load balancing is the process by which inbound internet protocol
(IP) traffic can be distributed across multiple servers. Load balancing
enhances the performance of the servers, leads to their optimal
utilization and ensures that no single server is overwhelmed. Load
balancing is particularly important for busy networks, where it is
difficult to predict the number of requests that will be issued to a
server.
Typically, two or more web servers are employed in a load balancing
scheme. In case one of the servers begins to get overloaded, the
requests are forwarded to another server. Load balancing brings down the
service time by allowing multiple servers to handle the requests. This
service time is reduced by using a load balancer to identify which
server has the appropriate availability to receive the traffic.
The process, very generally, is straightforward. A webpage request is
sent to the load balancer, which forwards the request to one of the
servers. That server responds to the load balancer, which in turn
sends the response on to the end user.
An important issue when operating a load-balanced service is how to
handle information that must be kept across the multiple requests in a
user's session. If this information is stored locally on one backend
server, then subsequent requests going to different backend servers
would not be able to find it. This might be cached information that can
be recomputed, in which case load-balancing a request to a different
backend server just introduces a performance issue.
One solution to the session data issue is to send all requests in a
user session consistently to the same backend server. This is known as
persistence or stickiness. A significant downside to this technique is
its lack of automatic failover: if a backend server goes down, its
per-session information becomes inaccessible, and any sessions depending
on it are lost. The same problem is usually relevant to central
database servers; even if web servers are "stateless" and not "sticky",
the central database is (see below).
Assignment to a particular server might be based on a username, the
client IP address, or random choice. Assignment by client IP address may
be unreliable because the client's perceived address can change as a
result of DHCP, network address translation, and web proxies. Random
assignments must be
remembered by the load balancer, which creates a burden on storage. If
the load balancer is replaced or fails, this information may be lost,
and assignments may need to be deleted after a timeout period or during
periods of high load to avoid exceeding the space available for the
assignment table. The random assignment method also requires that
clients maintain some state, which can be a problem, for example when a
web browser has disabled storage of cookies. Sophisticated load
balancers use multiple persistence techniques to avoid some of the
shortcomings of any one method.
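As a minimal sketch of persistence keyed on a client attribute, the function below hashes a client key (an IP address or username) to pick a backend, so the same key always lands on the same server without an assignment table. The name `pick_backend` and the backend names are hypothetical, and hashing is only one possible way to implement stickiness.

```python
import hashlib

def pick_backend(client_key, backends):
    """Deterministically map a client key (e.g. IP address or username) to a backend."""
    digest = hashlib.sha256(client_key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer and reduce it to an index.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]

backends = ["app1", "app2", "app3"]  # hypothetical backend names
chosen = pick_backend("203.0.113.7", backends)
```

Note that this simple modulo scheme remaps many clients whenever the backend list changes, and it inherits the unreliability of the client IP address described above.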
Another solution is to keep the per-session data in a database.
Generally this is bad for performance since it increases the load on the
database: the database is best used to store information less transient
than per-session data. To prevent a database from becoming a single
point of failure, and to improve scalability, the database is often
replicated across multiple machines, and load balancing is used to
spread the query load across those replicas. Microsoft's ASP.net State
Server technology is an example of a session database. All servers in a
web farm store their session data on State Server and any server in the
farm can retrieve the data.
Fortunately there are more efficient approaches. In the very common
case where the client is a web browser, per-session data can be stored
in the browser itself. One technique is to use a browser cookie,
suitably time-stamped and encrypted. Another is URL rewriting. Storing
session data on the client is generally the preferred solution: then the
load balancer is free to pick any backend server to handle a request.
However, this method of state-data handling is not really suitable for
some complex business logic scenarios, where session state payload is
very big or recomputing it with every request on a server is not
feasible, and URL rewriting has major security issues, since the
end-user can easily alter the submitted URL and thus change session
streams. Encrypted client side cookies are arguably just as insecure
since unless all transmission is over HTTPS, they are very easy to copy
or decrypt for man-in-the-middle attacks.
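The time-stamped cookie idea can be illustrated with a short sketch. Note the assumption: this example signs the session data with an HMAC rather than encrypting it, so tampering and expiry are detected but the contents remain readable by the client. The key, the lifetime, and the function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical server-side key
MAX_AGE_SECONDS = 1800                      # hypothetical session lifetime

def _sign(body):
    return hmac.new(SECRET_KEY, body.encode("utf-8"), hashlib.sha256).hexdigest()

def encode_session(data, now=None):
    """Serialize session data with an issue time and append an HMAC signature."""
    issued = time.time() if now is None else now
    payload = json.dumps({"data": data, "issued": issued}).encode("utf-8")
    body = base64.urlsafe_b64encode(payload).decode("ascii")
    return body + "." + _sign(body)

def decode_session(cookie, now=None):
    """Return the session data, or None if the cookie is tampered with or expired."""
    body, sep, signature = cookie.rpartition(".")
    if not sep:
        return None
    # Constant-time comparison of the supplied and expected signatures.
    if not hmac.compare_digest(signature.encode("utf-8"), _sign(body).encode("utf-8")):
        return None
    payload = json.loads(base64.urlsafe_b64decode(body))
    current = time.time() if now is None else now
    if current - payload["issued"] > MAX_AGE_SECONDS:
        return None
    return payload["data"]
```

Because any backend that holds the key can verify the cookie, the load balancer remains free to pick any server. The signature does not protect the cookie from being copied in transit, which is why HTTPS still matters.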
A variety of scheduling algorithms are used by load balancers to
determine which backend server to send a request to. Simple algorithms
include random choice or round robin. More sophisticated load balancers
may take into account additional factors, such as a server's reported
load, recent response times, up/down status (determined by a monitoring
poll of some kind), number of active connections, geographic location,
capabilities, or how much traffic it has recently been assigned.
High-performance systems may use multiple layers of load balancing.
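One of the factors listed above, the number of active connections, leads to the "least connections" policy. The following is a rough sketch that assumes the balancer already tracks a connection count per server and a set of servers that passed their last status poll; the function and parameter names are hypothetical.

```python
def least_connections(active, healthy):
    """Pick the healthy server with the fewest active connections.

    active  -- dict mapping server name to its current connection count
    healthy -- set of server names currently considered up
    """
    candidates = [server for server in active if server in healthy]
    if not candidates:
        raise RuntimeError("no healthy backend available")
    return min(candidates, key=lambda server: active[server])
```

For example, with counts `{"a": 5, "b": 2, "c": 0}` and only `a` and `b` healthy, the function returns `"b"`, because `c` is excluded despite being idle.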
Load balancing of servers by an IP sprayer can be implemented in different ways, depending on the load balancing types that the load balancer supports. The following algorithms are commonly used to distribute the load among the available servers.
In a random allocation, the HTTP requests are assigned to any server picked randomly among the group of servers. In such a case, one of the servers may be assigned many more requests to process, while the other servers are sitting idle. However, on average, each server gets its share of the load due to the random selection.
In a round-robin algorithm, the IP sprayer assigns the requests to a list of the servers on a rotating basis. The first request is allocated to a server picked randomly from the group, so that if more than one IP sprayer is involved, not all the first requests go to the same server. For the subsequent requests, the IP sprayer follows the circular order to redirect the request. Once a server is assigned a request, the server is moved to the end of the list. This keeps the servers equally assigned.
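The round-robin behaviour described above, including the random starting point, might be sketched as follows. The class name `RoundRobin` is hypothetical, and the rotation is implemented with a moving index rather than by physically moving servers to the end of a list, which gives the same order.

```python
import random

class RoundRobin:
    """Rotate through servers in circular order, starting at a random position."""

    def __init__(self, servers, rng=None):
        self.servers = list(servers)
        rng = rng or random.Random()
        # A random start keeps several sprayers from all hitting the same server first.
        self.position = rng.randrange(len(self.servers))

    def next_server(self):
        server = self.servers[self.position]
        self.position = (self.position + 1) % len(self.servers)
        return server
```

Each full pass over the list assigns exactly one request to every server, whatever the starting point.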
Weighted Round-Robin is an advanced version of the round-robin that eliminates the deficiencies of the plain round robin algorithm. In case of a weighted round-robin, one can assign a weight to each server in the group so that if one server is capable of handling twice as much load as the other, the powerful server gets a weight of 2. In such cases, the IP sprayer will assign two requests to the powerful server for each request assigned to the weaker one.
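A simple way to illustrate weighted round-robin is to repeat each server in the rotation as many times as its weight. This is only a sketch with hypothetical names; production balancers often use smoother interleaving so that a heavy server does not receive its requests back to back.

```python
from itertools import cycle

def weighted_cycle(weights):
    """Return an endless iterator that yields each server `weight` times per pass."""
    schedule = [server for server, weight in weights.items() for _ in range(weight)]
    return cycle(schedule)

rotation = weighted_cycle({"powerful": 2, "weaker": 1})
first_pass = [next(rotation) for _ in range(3)]
```

With weights of 2 and 1, every pass of three requests sends two to the powerful server and one to the weaker one.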
Server health checking is the ability of the load balancer to run a
test against the servers to determine if they are providing service.
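A very basic health check can be sketched as a TCP connection attempt. This assumes that accepting a connection is an adequate sign of service; real load balancers often go further and issue an application-level request. The function name and default timeout are hypothetical.

```python
import socket

def is_healthy(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A balancer would typically run such a check on a schedule and remove servers that fail it from the rotation until they pass again.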
A Web cache is a temporary storage place for files requested
from the Internet. After an original request for data has been
successfully fulfilled, and that data has been stored in the cache,
further requests for those files (a Web page complete with images, for
example) result in the information being returned from the cache rather
than the original location.
Caching is useful for any library. Faster response to users'
requests and saved bandwidth are never a bad thing. Caching really makes
sense for libraries that feel they must purchase more bandwidth to keep
up with increased usage. In such cases a cache server or cache
appliance could very likely lower the demand on the existing bandwidth,
thus making a costly bandwidth upgrade unnecessary. Suppose an upgrade
from a 256K data circuit to a full T1 will increase your monthly
Internet bill by $500. A $3,000 investment in caching (and caching
solutions often can be implemented for much less than that) will start
to show a return after six months.
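The break-even point in this example is simply the up-front cost divided by the monthly saving. A small sketch, using the figures from the paragraph above and a hypothetical function name:

```python
def payback_months(upfront_cost, monthly_savings):
    """Number of months until avoided monthly costs equal the up-front investment."""
    return upfront_cost / monthly_savings

months = payback_months(3000, 500)  # 6.0 months for the figures in the text
```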
Web sites are continually updating their content. News
headlines change, stock quotes change, weather changes. It may seem that
caching is not worthwhile if it is returning dated material. A traffic
report that is two hours old doesn't do you much good. Fortunately there
are checks and balances in place to ensure that the content you are
viewing is current.
So how does your cache know what to hang on to and what to let go?
That depends on choices made by the Web developer as well as the way the
user configures his cache. As mentioned above, Web sites are made up of
individual pieces. Each one of these pieces is encoded with specific
information that will tell your cache how to handle it. This information
may specify, “Don't cache this item,” in which case the cache will
ignore it. The item may have a “max age” specified. This tells the cache
that after a set amount of time the cache must check in with the Web
site for newer versions of that object. The “expires” field serves
roughly the same purpose. The item might also have a “last modified”
field. Last modified is another way for your cache to ask the Web server
if the object has been modified since your last visit. If it has, the
cache gets a new copy; if not, the cache just hangs on to the copy it
already has. The Web site administrator controls each of these items.
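In HTTP these instructions travel as response headers: "don't cache" corresponds to `Cache-Control: no-store`, "max age" to `Cache-Control: max-age=N`, and "last modified" to the `Last-Modified` header. The sketch below shows a greatly simplified version of the decision a cache makes; the function name and return labels are hypothetical, header names are assumed to be spelled exactly as shown, and the `Expires` header and many other rules of real HTTP caching are left out.

```python
def cache_decision(headers, age_seconds):
    """Decide what to do with a stored object of the given age.

    Returns "skip" (never store it), "fresh" (serve the stored copy),
    or "revalidate" (ask the origin server whether it has changed).
    """
    cache_control = headers.get("Cache-Control", "").lower()
    directives = [part.strip() for part in cache_control.split(",") if part.strip()]
    if "no-store" in directives:
        return "skip"
    for directive in directives:
        if directive.startswith("max-age="):
            try:
                max_age = int(directive.split("=", 1)[1])
            except ValueError:
                break  # Malformed value: fall through to revalidation.
            return "fresh" if age_seconds < max_age else "revalidate"
    # With no usable lifetime, check back with the server (for example by
    # sending If-Modified-Since when a Last-Modified date is known).
    return "revalidate"
```

For example, an object served with `Cache-Control: public, max-age=60` is treated as fresh at 30 seconds old and must be revalidated at 90 seconds.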
There are many cache products available. Each has lots of different
configuration options to help ensure that your data is current. Caches
can be configured to accept all, some, or none of the priorities that
the Web site administrator sets.
A cache appliance is a hardware and software caching solution all in
one unit. A cache server is a software-only solution. The software is
installed on an existing server. Unlike a browser cache that only
benefits one user, cache servers or appliances are shared and benefit
every user in the network. The cache server/appliance sits on the Local
Area Network. A reverse proxy cache usually has a load balancing web
cache for incoming requests.
Forward/transparent proxy servers, reverse proxy servers (which are actually what the cache appliances are running internally) and web servers mostly have web caches. The caches in web servers are RAM caches, as they already have the resources served locally. The caches on proxy servers can be in RAM, on disk, or usually both. It is highly recommended to use a 15k RPM hard drive or an SSD for the proxy server's disk cache.
Secure Sockets Layer (SSL) certificates provide authentication
between a server and a client computer in a Web application. Companies
or businesses with a dedicated SSL certificate must host that
certificate on a Web server. Heavy use of the certificate can put a
strain on the machine and slow down the application.
SSL offloading takes all the processing of SSL encryption and
decryption off the main Web server and moves it to a separate device
designed specifically for the task. This improves the performance of the
main Web server, because the dedicated device handles the SSL workload
efficiently.
SSL offloading increases the effectiveness of the security
offered by the certificates because the designated device can devote
more processing time to warding off attacks. It increases the Website
and application speed and prevents companies from needing to add more
Web servers to keep up with the demands of a frequently used SSL
certificate.
SSL termination performs decryption on the designated device,
then sends the unencrypted data to the main Web server. This data passes
through extra security measures such as an intrusion detection system
and a firewall to protect the transmission of unencrypted data. SSL
bridging decrypts and checks the data for malicious code before it
reaches the server. It then re-encrypts it and processes it again after
the server redirects it to the designated device. The extra step slows
down the process.
There's a finite amount of bandwidth on most Internet
connections, and anything administrators can do to speed up the process
is worthwhile. One way to do this is via HTTP compression, a capability
built into both browsers and servers that can dramatically improve site
performance by reducing the amount of time required to transfer data
between the server and the client. The principles are nothing new — the
data is simply compressed. What is unique is that compression is done on
the fly, straight from the server to the client, and often without
users knowing.
HTTP compression is easy to enable and requires no client-side
configuration to obtain benefits, making it a very easy way to get extra
performance. This article discusses how it works, its advantages, and
how to configure Apache and IIS to compress data on the fly.
Most users' knowledge of compression is from compressing a group of
files that they download, extract, and open. But compression can also be
used passively to compress documents as they are being transferred to a
client's browser. Because it's a passive process, the server can reduce
the size of the pages sent, therefore reducing the download time for
users and their bandwidth usage.
Working the numbers helps clarify the gains. You can typically reduce
an HTML document to less than half of its original size. This, in turn,
halves the amount of time the client needs to download the page as well
as the amount of bandwidth required. All of this is achieved without
actually changing the way the site works, its page layout, or the
content. The only thing that changes is the way the information is
transferred.
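The on-the-fly behaviour can be sketched as follows: the server compresses the response body only when the client's `Accept-Encoding` request header lists gzip, and labels the result with `Content-Encoding`. This is a simplified illustration with a hypothetical function name; it ignores quality values such as `gzip;q=0` that a full implementation would honour.

```python
import gzip

def compress_if_accepted(body, accept_encoding):
    """Return (body, extra_headers), gzip-compressing when the client allows it."""
    encodings = [item.split(";")[0].strip().lower() for item in accept_encoding.split(",")]
    if "gzip" in encodings:
        headers = {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
        return gzip.compress(body), headers
    return body, {}

# A repetitive, table-heavy page compresses especially well.
page = b"<tr><td>cell</td><td>cell</td></tr>\n" * 200
compressed, headers = compress_if_accepted(page, "gzip, deflate")
ratio = len(compressed) / len(page)
```

The browser reverses the step transparently, so neither the page layout nor the content changes.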
Unfortunately, there are limitations.
Not all files are suitable for compression. For obvious reasons,
files that are already compressed, such as JPEGs, GIFs, PNGs, movies,
and bundled content (e.g., Zip, Gzip, and bzip2 files) are not going to
compress appreciably further with a simple HTTP compression filter.
Therefore, you are not going to get much benefit from compressing these
files or a site that relies heavily on them.
However, sites that have a lot of plain text content, including the
main HTML files, XML, CSS, and RSS, may benefit from the compression. It
will still depend largely on the content of the file; most standard
HTML text files will compress by about a half, sometimes more. Heavily
formatted pages, for example those that make heavy use of tables (and
therefore repetitive formatting content) may compress even further,
sometimes to as little as one-third of the original size.
A 2009 article by Google engineers Arvind Jain and Jason
Glasgow states that more than 99 person-years are wasted daily due to
page load time increases when users do not receive compressed content.
This occurs where anti-virus software interferes with connections to
force them to be uncompressed, where proxies are used (with overcautious
web browsers), where servers are misconfigured, and where browser bugs
stop compression being used. Internet Explorer 6, which drops to HTTP
1.0 (without features like compression or pipelining) when behind a
proxy, a common configuration in corporate environments, was the
mainstream browser most prone to falling back to uncompressed HTTP.
URL translation is the translation of externally known URLs to internal locations.
URL Rewriting is a server-side technique for mapping URL requests to request handlers.
Typically there is a direct mapping between request URL and the
handler for that request. All requests that end in .php will be handled
by a PHP script with the given name. Similarly, request paths that end
in .html will typically be handled by a static file handler. The mapping
between URL and handler is typically static, and depends solely on the
"extension" of the URL Request.
URL Rewriting allows administrators to more flexibly map between the
incoming requests and the actual resource that handles the request on
the server. For example, using URL Rewriting, requests that have a .html
extension could be served by ASP.NET, or requests that have no
extension could be served by a PHP script.
Most URL Rewriters match the incoming URL against a set of patterns,
and rewrite the URL according to which patterns match. The language used
in the most powerful and flexible rewriters to describe the patterns is
known as Regular Expressions. Some rewriters also allow rewriting based
on other factors, including the request headers, Server variables, and
even the state of the server filesystem.
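Pattern-based rewriting can be illustrated with a short sketch that tries a list of regular-expression rules in order and applies the first one that matches. The rules, paths, and handler names here are hypothetical and do not reflect the configuration syntax of any particular rewriter.

```python
import re

# Each rule pairs a pattern for the public URL path with the internal target.
REWRITE_RULES = [
    (re.compile(r"^/products/(\d+)$"), r"/product.php?id=\1"),
    (re.compile(r"^/(.+)\.html$"), r"/render.php?page=\1"),
]

def rewrite(path):
    """Return the internal path for a request, or the path unchanged if no rule matches."""
    for pattern, replacement in REWRITE_RULES:
        if pattern.match(path):
            return pattern.sub(replacement, path)
    return path
```

With these rules, `/products/42` is handled internally as `/product.php?id=42`, while a path such as `/img/logo.png` passes through untouched.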
Many URL Rewriters can also redirect requests. This has led to
confusion in the terms Rewrite and Redirect. To learn the difference,
see Redirecting versus Rewriting. IIRF can perform URL redirects as well
as URL rewrites.
Search Engine Optimization (SEO)
SEO is a broad topic, but the main goal is to assist search
engines in finding content on a web site. One aspect of that is
optimizing the URLs themselves.
Making user-friendly URLs
Similar in effect to SEO, this allows the use of friendly public
URLs where they are observed by users in links and browser bars.
Elements within URLs that are meaningful only to server-side technology,
including the extension of the server-side script or web app platform,
can be obscured from the public.
Faking people out
In some cases the web administrator would like to conceal the
server-side technology that is being used. URL Rewriting allows, for
example, a public URL that ends in .jsp to be handled by a .php script.
Routing requests
You can force certain requests to use a secure connection (https), or a particular server.
Server-side technology migrations
When migrating from one technology to another in stages, URL
rewriting can be used to keep the URL space stable while things change
on the server back-end. URL Rewriting can also be used to support
migration of "old" or stale URLs to the new URL namespace, when those
changes occur.
Injecting custom processing
In some cases, a server administrator may wish to inject new,
additional, server-side processing for well-known existing URLs. One
example here is inserting special image handling logic behind a .jpg
URL. You may wish to block access to image URLs from outside referrers,
to limit bandwidth leeching.
Filtering URL requests
An administrator may want to restrict access to certain URLs based on the referer, the requesting IP address, and so on.