So I’m setting up a low-budget web application for production use. This means web servers, database servers, and file servers with the smallest possible footprint, the lowest possible cost, and the greatest possible reliability. These can be tricky things to manage, and while there are plenty of solutions available, finding one that fits your needs may prove difficult.
My first philosophy in this process is “passing the buck”; I will explain how this can be done throughout this post. Whatever your solution, the things to think about when hosting your own services are bandwidth, address space, environment, equipment, and software. I have the option of maintaining my own servers at a significantly lower monetary cost in the long run, but I personally cannot afford the time required to maintain those servers and ensure uptime. A more feasible solution might be to go with one or more VPS hosting providers such as Slicehost or Linode. These options often allow quick scaling of hardware for increased load capacity, and the hosting provider becomes responsible for the maintenance and availability of the hardware. At a minimum this solution passes environment and hardware maintenance responsibility to the hosting company (note that you are still responsible for monitoring your usage and making sure the hardware you pay for meets your requirements).
The easiest barrier to overcome in hosting is software; most people can get hold of an open source LAMP stack and set it up to provide a generally reliable web host with little hassle. For my purposes I will be using Ubuntu, a mid-grade Linux distribution based on Debian, as the OS. For a more reliable and better supported application stack you might consider a paid subscription such as Red Hat Enterprise Linux. Many VPS hosting providers offer these common distributions, specifically configured for their hardware, as part of your hosting fees.
In a LAMP (Linux, Apache, MySQL, PHP) stack, servers are configured with the Apache web server and the PHP scripting language is used in site development. These are not the only software packages available; many sites use the Ruby scripting language (commonly as part of the Ruby on Rails application framework) and the WEBrick web server, for example. Finding a scripting language that you are comfortable with and can find documentation for is important and is often the first step in selecting your software stack. For my purposes I will use Apache 2.0 with PHP 5. While setting up Apache I must choose between configurations such as mpm_prefork (usually requires more memory but provides a stable environment and supports PHP 5 without CGI) and mpm_worker (makes more efficient use of memory and multiple CPUs but requires CGI to support PHP 5). Read more about mpm_prefork vs mpm_worker at http://httpd.apache.org/docs/2.0/mpm.html. Whatever installation and configuration you use, be sure to research the installation process and operating limitations properly before beginning your install; it can be difficult or impossible to make changes down the road.
Depending upon your comfort level with configuring the software stack, you may choose to go with a full hosting service provider such as DreamHost or MediaTemple. When selecting a hosting provider you must carefully consider availability requirements, cost, and services. Many hosts that promise low cost with unlimited bandwidth and unlimited disk usage experience low uptime because they cannot support the product they are advertising. Some larger hosts experience downtime due to DDoS attacks provoked by their size and image on the web. When looking at potential hosts, use third-party sites such as besthostratings.com, search for user reviews, and check the hosts’ own support sites for recurring problems. Keep in mind that many of these hosting providers offer little flexibility in their software stack and, to maintain server integrity, will often not allow users to run custom environments that include helper packages such as ImageMagick. For this reason I will often sacrifice my “passing the buck” philosophy.
Whether you host your own servers or use a hosting provider, you will often be faced with hardware decisions that affect your overall performance. The two factors most likely to affect performance are memory and CPU usage. You should take into consideration how much memory your web application is using and how much system memory is available. A web server should never end up running from swap space because system memory is exhausted. Ensure that the total memory available exceeds the average memory used by a single web server process multiplied by the number of processes running during peak usage, with room left over for the operating system and any other services on the machine. Tweaking your web server to work within the physical limitations of your machine is important for the quality of the user experience; tweaking your machine to allow for the maximum number of users is important for the quality of your site or web service. Commands such as ‘free’ and ‘top’ can be used to determine memory usage on most Linux hosts. CPU usage usually only becomes a problem when a large amount of processing is required for a large number of users, as with image and video processing and streaming. Your memory will often be exhausted before you reach your CPU limit.
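The memory rule above is simple arithmetic, so here is a rough sketch of it in Python. The function names and the sample numbers (512 MB of RAM, 128 MB reserved for the OS and database, 20 MB per web server process) are made up for illustration; substitute figures you measure yourself with ‘free’ and ‘top’.

```python
def max_workers(total_mb, reserved_mb, avg_process_mb):
    """Estimate how many web server processes fit in physical memory.

    total_mb       -- physical RAM on the machine
    reserved_mb    -- RAM set aside for the OS, database, and other services
    avg_process_mb -- average resident memory of one web server process
    """
    if avg_process_mb <= 0:
        raise ValueError("avg_process_mb must be positive")
    usable_mb = total_mb - reserved_mb
    if usable_mb <= 0:
        return 0
    return int(usable_mb // avg_process_mb)


def fits_in_memory(total_mb, reserved_mb, avg_process_mb, peak_processes):
    """Return True if the peak number of processes fits without swapping."""
    return peak_processes <= max_workers(total_mb, reserved_mb, avg_process_mb)


if __name__ == "__main__":
    # Hypothetical small VPS: 512 MB RAM, 128 MB reserved, 20 MB per process.
    print(max_workers(512, 128, 20))         # 19
    print(fits_in_memory(512, 128, 20, 25))  # False
```

With Apache 2.0 and mpm_prefork, a number like this is a reasonable starting point for the MaxClients setting, though it is only an estimate and worth checking against real traffic.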
Bandwidth and disk usage requirements are also hardware issues to be considered. Sites that serve large images to large numbers of users often run into problems with bandwidth and disk space. These problems can often be reduced by compressing traffic on the web server, by storing images in compressed formats, and by resizing images so they are no larger than the largest size displayed on the site. Compression may help reduce bandwidth requirements, but it is important to remember that it also taxes your CPU and memory during compression and decompression.
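To get a feel for that trade-off, here is a small sketch that measures how much gzip shrinks a payload, using only Python’s standard library. The function name and the sample HTML are my own for illustration; your real pages will compress differently.

```python
import gzip


def compression_savings(payload, level=6):
    """Return (original_size, compressed_size, fraction_saved) for gzip."""
    if not payload:
        return 0, 0, 0.0
    compressed = gzip.compress(payload, compresslevel=level)
    saved = 1 - len(compressed) / len(payload)
    return len(payload), len(compressed), saved


if __name__ == "__main__":
    # Hypothetical, highly repetitive HTML fragment.
    html = b"<li class='item'>example entry</li>\n" * 1000
    original, compressed, saved = compression_savings(html)
    print(original, compressed, round(saved, 2))
```

Text such as HTML, CSS, and JavaScript tends to compress well. Formats that are already compressed, such as JPEG images, gain little or nothing, so compressing them again mostly just costs CPU time.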
In my situation I found that not one solution but a combination of all of them may provide the best answer. By using multiple VPS providers with mirrored configurations, I can set up failover DNS that switches between the two services if one goes offline due to system failure or exceeded limits, giving me the time needed to recover from the situation and make the necessary adjustments. One such failover service is DNS Made Easy. Because these VPS hosts are reliable I can use them to support the bulk of my application stack; however, they are limited in bandwidth and disk space. For that reason I use multiple accounts on a full hosting provider, with lower anticipated availability but much higher limits for bandwidth and disk usage, as file storage for non-application files such as images, videos, and other media uploaded by site users. By using multiple accounts and failover configurations you can provide some measure of safeguard against reduced availability; by mounting the remote service directly on the web server or by using sub-domains you can provide secure distribution of those files. If the hosting service does fail, your application will continue to run on the VPS hosting providers. Be sure that your hosting provider allows this sort of use, as they may shut you down for improper service usage. Also note that you double the bandwidth usage on your web server if you mount the hosting provider directly on the web server, because each file must travel from the file host to the web server before reaching the client. Finally, using personal servers for non-mission-critical, low-usage services such as development and system monitoring gives you an inexpensive way to maintain those services.
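The decision a failover setup makes can be sketched in a few lines of Python. This is not how DNS Made Easy or any other service works internally; it only illustrates the idea of preferring the primary host and falling back to a mirror when a health check fails. The host names are placeholders, and the health check is a plain TCP connection attempt, which is an assumption on my part.

```python
import socket


def is_reachable(host, port=80, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_active(hosts, check=is_reachable):
    """Return the first host in priority order that passes the check."""
    for host in hosts:
        if check(host):
            return host
    return None  # every host failed the check


if __name__ == "__main__":
    # Placeholder names for two mirrored VPS hosts, primary first.
    mirrors = ["vps1.example.com", "vps2.example.com"]
    print(pick_active(mirrors))
```

Passing the check in as a parameter keeps the selection logic separate from the network test, so you could swap in an HTTP request or a disk quota check without touching the rest.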
Whatever your decision and design, keep in mind the balance between quality of service and cost, because in the end they are directly proportional.