<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://www.orcaware.com/svn/mediawiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Dchristian</id>
	<title>SubversionWiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://www.orcaware.com/svn/mediawiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Dchristian"/>
	<link rel="alternate" type="text/html" href="https://www.orcaware.com/svn/wiki/Special:Contributions/Dchristian"/>
	<updated>2026-04-20T08:39:11Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.45.3</generator>
	<entry>
		<id>https://www.orcaware.com/svn/mediawiki/index.php?title=Server_performance_tuning_for_Linux_and_Unix&amp;diff=1751</id>
		<title>Server performance tuning for Linux and Unix</title>
		<link rel="alternate" type="text/html" href="https://www.orcaware.com/svn/mediawiki/index.php?title=Server_performance_tuning_for_Linux_and_Unix&amp;diff=1751"/>
		<updated>2008-07-15T20:17:14Z</updated>

		<summary type="html">&lt;p&gt;Dchristian: /* Watch your entropy */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== General notes ==&lt;br /&gt;
&lt;br /&gt;
There are several good web sites and books about how to set up subversion, but I couldn&#039;t find anything about how to optimize performance.  This is a guide to understanding and improving the performance of subversion when it is served using svnserve or HTTP (via apache).&lt;br /&gt;
&lt;br /&gt;
Much of the operating system tuning is similar to what would be done for a database or e-mail server.  It can be useful to search those areas for additional advice http://dev.mysql.com/doc/refman/5.0/en/innodb-configuration.html&lt;br /&gt;
&lt;br /&gt;
These notes have a lot of Linux specific details, but the concepts should apply to most Unix based systems.  Feel free to add details for other flavors of Unix here.  Non-posix tuning notes should probably go on another page.&lt;br /&gt;
&lt;br /&gt;
The repository can be stored in either the Berkeley DB or the FSFS repository format.  The choice is hidden from users.  FSFS is generally considered to be faster.&lt;br /&gt;
&lt;br /&gt;
== Making reads cheaper ==&lt;br /&gt;
&lt;br /&gt;
The Unix concept of &amp;quot;file access time&amp;quot; (commonly called atime) is a performance problem.  When filesystem semantics were being defined, it seemed like a good idea to know when a file was last accessed.  The downside is that every file open call now causes a disk write.  A few utilities use this information (e.g. tmpwatch and mail), but subversion never uses atime.  Subversion performance is improved by avoiding the access time writes.&lt;br /&gt;
&lt;br /&gt;
For a local filesystem, you can disable this behavior with mount options.  On Linux, it&#039;s the &#039;noatime&#039; and &#039;nodiratime&#039; options.  On an NFS filesystem, the atime recording happens on the server and must be disabled in the server&#039;s configuration.&lt;br /&gt;
&lt;br /&gt;
A lazy atime approach called &amp;quot;relatime&amp;quot; was introduced in Linux-2.6.20 and mount-2.13.  This eliminates most atime writes without breaking the few utilities that need it.  This is most useful if the repository must be on the same partition as the mail spool and/or temporary files.  See: http://kerneltrap.org/node/14148 and http://kernelnewbies.org/Linux_2_6_20&lt;br /&gt;
&lt;br /&gt;
== Making writes cheaper ==&lt;br /&gt;
&lt;br /&gt;
Subversion uses the fsync() call (or the equivalent on non-Unix operating systems) to tell the operating system to write data to disk.  Up until that point, the data is usually only in memory and the operating system will write it to disk &amp;quot;when it gets around to it&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
By calling fsync() before finishing a commit, subversion is trying to guarantee that everything it said had been done would be there when the machine re-boots.  Waiting for data to write out to disk is often the slowest part of a commit.&lt;br /&gt;
&lt;br /&gt;
However, the operating system doesn&#039;t always hold up its end of the bargain.  On Linux, fsync() only ensures that the data is on its way to the disk &amp;quot;as soon as possible&amp;quot;.  If the write cache is enabled on the drive, then fsync() doesn&#039;t actually wait for the data to hit the disk platter before returning.  This means there is a window of time during which a power loss can leave the disk state out of step with what subversion reported.&lt;br /&gt;
&lt;br /&gt;
One way to significantly increase fsync() performance is to use a RAID controller with a battery-backed write cache.  The cache is treated as part of the disk system.  As soon as the data is in the cache, the fsync() can safely return.  This means you don&#039;t have to wait for the disk head seek or the data transfer.  If power is interrupted, the RAID controller will finish writing out the cache when power is restored.&lt;br /&gt;
&lt;br /&gt;
A newer way to avoid this problem is a flash-based disk.  There is no latency from head movement or waiting for the disk to rotate.  This becomes more significant when writing many small files (like many FSFS writes).  The current downsides of flash disks are high cost, limited capacity, and low write bandwidth (but these problems are improving).&lt;br /&gt;
&lt;br /&gt;
== Reducing the number of writes ==&lt;br /&gt;
&lt;br /&gt;
As of subversion-1.5, transactions can be built up on a different filesystem than the one holding the repository.  This is valuable when the repository lives on a slower filesystem like NFS.&lt;br /&gt;
&lt;br /&gt;
To implement this, do the following:&lt;br /&gt;
  stop all servers that can write to the repository&lt;br /&gt;
  cd REPO_PATH/db&lt;br /&gt;
  mv transactions /LOCAL/DISK/PATH/&lt;br /&gt;
  ln -s /LOCAL/DISK/PATH/transactions .&lt;br /&gt;
  start the servers&lt;br /&gt;
&lt;br /&gt;
== Reduce directory index size ==&lt;br /&gt;
&lt;br /&gt;
The subversion-1.5 repository format allows the revisions to be stored in subdirectories that don&#039;t grow past a specified size.  This allows repositories to store many more revisions than can (efficiently) be stored in one directory.&lt;br /&gt;
&lt;br /&gt;
Modern filesystems can handle hundreds of thousands of files in a single directory.  However, performance can suffer as the directory index starts to use multiple levels of indirection.  Some administration tools may also have trouble with very large directories.  Splitting the revision store into sub-directories avoids all these problems.&lt;br /&gt;
&lt;br /&gt;
The shard size can be adjusted by editing the &amp;quot;layout sharded&amp;quot; line in &amp;quot;db/format&amp;quot; after &#039;svnadmin create&#039; but before populating the repository.  The default is 1000 revisions per subdirectory.  Non-sharded repositories can be loaded into a new, sharded repository using &amp;quot;svnadmin load&amp;quot; or &amp;quot;svnsync&amp;quot;.&lt;br /&gt;
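&lt;br /&gt;
As a rough sketch of that edit (not part of the original notes): set_shard_size below is a hypothetical helper, and it assumes db/format already contains a &amp;quot;layout sharded N&amp;quot; line and that the repository is still empty.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper (an assumption, not a subversion tool): change the
# shard size recorded in db/format of a freshly created, still empty
# FSFS repository.
set_shard_size() {
    repo_path=$1
    shard_size=$2
    format_file="$repo_path/db/format"
    # Refuse to touch a repository that does not use the sharded layout.
    grep -q "^layout sharded " "$format_file" || return 1
    # Write the edited copy next to the original, then swap it into place.
    sed "s/^layout sharded .*/layout sharded $shard_size/" "$format_file" > "$format_file.new" || return 1
    mv -f "$format_file.new" "$format_file"
}
```
&lt;br /&gt;
For example, set_shard_size /path/to/new/repo 2000 would ask for 2000 revisions per subdirectory; the path and the value are placeholders.&lt;br /&gt;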
&lt;br /&gt;
== Optimize write-once files on NFS ==&lt;br /&gt;
&lt;br /&gt;
If the repository is on an NFS filesystem, then a cache consistency check is made every time a file is opened.  Since the revision files in an FSFS repository never change, it is worthwhile to skip the cache checks on these files.  The subversion-1.5 repository format stores immutable files in specific subdirectories so that this can be done.&lt;br /&gt;
&lt;br /&gt;
The NFS cache check can be disabled on Linux by passing the &#039;nocto&#039; option to the mount command (note: the man page claims this is ignored, but it isn&#039;t on linux-2.6).  You need coherency for some files, so the NFS volume is also mounted without the option on a different mount point.  Symbolic links are made from the cache coherent mount point to the &#039;nocto&#039; mount for these directories: revs and txn-protorevs.&lt;br /&gt;
&lt;br /&gt;
Implementation example (not complete, just an outline of the key steps):&lt;br /&gt;
  stop all servers that can write to the repository&lt;br /&gt;
  sudo mount -t nfs nfs_server:/mount_point /mnt/svn -o \&lt;br /&gt;
    rw,nosuid,tcp,rsize=32768,wsize=32768&lt;br /&gt;
  sudo mount -t nfs nfs_server:/mount_point /mnt/svn-nocto -o \&lt;br /&gt;
    rw,nosuid,tcp,rsize=32768,wsize=32768,nocto,actimeo=3600&lt;br /&gt;
  cd /mnt/svn/repo_path/db&lt;br /&gt;
  mv revs revs-nocto&lt;br /&gt;
  mv txn-protorevs txn-protorevs-nocto&lt;br /&gt;
  ln -s /mnt/svn-nocto/repo_path/db/revs-nocto revs&lt;br /&gt;
  ln -s /mnt/svn-nocto/repo_path/db/txn-protorevs-nocto txn-protorevs&lt;br /&gt;
  start the servers&lt;br /&gt;
  &lt;br /&gt;
== Increase NFS caching timeout ==&lt;br /&gt;
&lt;br /&gt;
On Linux, metadata on files from NFS is only kept for a finite period of time.  This can be changed by passing the actimeo option to the mount command.  The man page claims the default is 60 (seconds), but some experimentation suggests it may be higher than that.  For a &#039;nocto&#039; mount point, this value can be raised to something much larger (e.g. 3600).  See the above example.&lt;br /&gt;
&lt;br /&gt;
== Distributing CPU load ==&lt;br /&gt;
&lt;br /&gt;
The subversion server communicates with the clients by transmitting differences in state, so the CPU load to calculate the difference can be significant.  By storing the repository on NFS, you can have multiple &amp;quot;front end&amp;quot; (FE) systems that share the computational load and provide redundancy.  A network load balancer makes all the FEs appear as one server to users.&lt;br /&gt;
&lt;br /&gt;
The FEs can either run svnserve or http-DAV.  If DAV is used, you need to ensure that the load balancer keeps an entire transaction on the same FE (to allow transactions to be built up on local disk).  The load balancer must be configured with &amp;quot;machine affinity&amp;quot; set, so that all HTTP connections from a client will be routed to the same server.  You should also configure apache to keep a single TCP connection for the entire transaction (see example below).&lt;br /&gt;
&lt;br /&gt;
Apache configuration to maintain a TCP connection:&lt;br /&gt;
  # 1. Enable HTTP persistent connections so a single transaction can&lt;br /&gt;
  #    be built up over a single connection.&lt;br /&gt;
  KeepAlive             on&lt;br /&gt;
  # 2. Allow as many KeepAlives as required (0 =&amp;gt; infinite) to keep&lt;br /&gt;
  #    the same connection alive.&lt;br /&gt;
  MaxKeepAliveRequests  0&lt;br /&gt;
  # 3. Limit a child to serving only this 1 connection.&lt;br /&gt;
  MaxRequestsPerChild   1&lt;br /&gt;
&lt;br /&gt;
The last one is counter-intuitive, but see the &amp;quot;Note&amp;quot; at http://httpd.apache.org/docs/2.2/mod/mpm_common.html#maxrequestsperchild.&lt;br /&gt;
&lt;br /&gt;
== High storage system reliability ==&lt;br /&gt;
&lt;br /&gt;
The purpose of a version control system is to store a sequence of file/directory versions so you can retrieve them in the future.  None of this matters if the storage system fails.&lt;br /&gt;
&lt;br /&gt;
The simplest step is to do periodic backups of the repository.  This limits the loss to the changes that happened since the last backup.  If the repository is large and the commit rate is high, it may be impossible to back up frequently enough to prevent significant data loss.  For example, if your repository gets one commit per second and you do a backup every hour, you may lose 3600 revisions if the disk fails.  This is a large scale example, but the point is to gather your own numbers and figure out how much you might lose.&lt;br /&gt;
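&lt;br /&gt;
The arithmetic behind that estimate can be sketched as a small shell function.  The name max_lost_revisions is made up for illustration, and it only handles whole numbers (commits per second and seconds between backups).&lt;br /&gt;
&lt;br /&gt;
```shell
# Illustrative helper (hypothetical name): worst-case number of revisions
# lost if the disk fails just before the next backup.
# Arguments: commits per second, seconds between backups (both integers).
max_lost_revisions() {
    commits_per_second=$1
    backup_interval_seconds=$2
    echo $(( commits_per_second * backup_interval_seconds ))
}
```
&lt;br /&gt;
With the numbers from the example above, max_lost_revisions 1 3600 prints 3600.&lt;br /&gt;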
&lt;br /&gt;
The next step is to make the disk system redundant using RAID technology.  This allows one (and sometimes more) disks to fail without losing data.  This still won&#039;t help if additional disks fail during recovery or the entire array is lost due to fire, theft, etc.&lt;br /&gt;
&lt;br /&gt;
Advanced NFS servers can be configured to do synchronous mirroring and/or asynchronous mirroring (also known as snapshot replication).&lt;br /&gt;
These capabilities are available in some commercial servers, or you can find various free alternatives by searching for &amp;quot;NFS server high availability&amp;quot; or &amp;quot;NFS server snapshot replication&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Synchronous mirroring sends every write to two independent storage systems and requires a high bandwidth network (e.g. gigabit ethernet).  It reduces performance, but the caching optimizations listed above can help.  The primary and slave systems are usually located in different rooms (or buildings) and on different electrical circuits.&lt;br /&gt;
&lt;br /&gt;
Asynchronous mirroring periodically updates a second storage system with changes from the master.  Every few minutes it makes a &amp;quot;snapshot&amp;quot; of the system and then transmits the difference between the previous snapshot and the current one to the slave.  This uses less bandwidth, but lags the main filesystem by several minutes.  It can allow a backup filesystem to be located in another geographic region.&lt;br /&gt;
&lt;br /&gt;
Another approach is to use subversion tools to maintain a mirror.  Set up svnsync to periodically sync a backup server off the main one.  This can lag behind by the polling interval, but it is simple to set up.&lt;br /&gt;
&lt;br /&gt;
You can eliminate the lag by setting up a post-commit script that runs &amp;quot;svnadmin dump --incremental -r N&amp;quot; of that commit onto a separate partition/server.  This creates a transaction log of commits that can be replayed on a recent backup to restore full state.&lt;br /&gt;
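&lt;br /&gt;
A minimal sketch of such a hook body follows.  The function name dump_revision, the SVNADMIN override and the dump directory are assumptions made for illustration; only the svnadmin dump options come from the text above.&lt;br /&gt;
&lt;br /&gt;
```shell
# Sketch of a post-commit helper (hypothetical name and layout).
# Writes one incremental dump file per committed revision into dump_dir,
# which should live on a separate partition or server mount.
# SVNADMIN may be set to point at a specific svnadmin binary (assumption).
dump_revision() {
    repos=$1
    rev=$2
    dump_dir=$3
    "${SVNADMIN:-svnadmin}" dump "$repos" --incremental -r "$rev" > "$dump_dir/r$rev.dump"
}
```
&lt;br /&gt;
Subversion passes the repository path and the new revision number to hooks/post-commit as its first two arguments, so the hook could call dump_revision &amp;quot;$1&amp;quot; &amp;quot;$2&amp;quot; /backup/dumps (the directory is a placeholder).&lt;br /&gt;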
&lt;br /&gt;
== Watch your entropy ==&lt;br /&gt;
&lt;br /&gt;
When a server handles lots of queries, certain protocols can deplete the [http://en.wikipedia.org/wiki/Entropy_(computing) entropy] [http://en.wikipedia.org/wiki/Urandom pool].  The svn:// protocol (served by svnserve) and [http://en.wikipedia.org/wiki/Simple_Authentication_and_Security_Layer SASL] ciphers read from /dev/random for every new connection.  If the entropy pool becomes depleted, then the service will become very slow.&lt;br /&gt;
&lt;br /&gt;
The pool should have 100+ bits in it for good operation.  You can check the entropy pool size on Linux like this:&lt;br /&gt;
  sysctl kernel.random.entropy_avail&lt;br /&gt;
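&lt;br /&gt;
The same value can be read from /proc and compared against the 100 bit guideline.  This is only a sketch: entropy_ok is a hypothetical helper, and the optional second argument exists so it can be pointed at another file for testing.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: succeed if the entropy pool holds at least min_bits.
# Defaults: 100 bits, read from the Linux /proc interface that backs
# the kernel.random.entropy_avail sysctl.
entropy_ok() {
    min_bits=${1:-100}
    entropy_file=${2:-/proc/sys/kernel/random/entropy_avail}
    # Return 2 if the file cannot be read (e.g. on a non-Linux system).
    bits=$(cat "$entropy_file") || return 2
    [ "$bits" -ge "$min_bits" ]
}
```
&lt;br /&gt;
A monitoring script could run entropy_ok || echo &amp;quot;entropy pool is low&amp;quot; at regular intervals.&lt;br /&gt;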
&lt;br /&gt;
This should not be a problem if APR was configured with &amp;quot;--with-devrandom=/dev/urandom&amp;quot;.  SASL has a similar configuration option (called???).  You may need to check how the packages supplied with your OS are configured.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
(TODO: reference the SVN benchmarking capabilities in mstone http://mstone.sourceforge.net/)&lt;br /&gt;
&lt;br /&gt;
[[User:Dchristian|Dchristian]] 13:42, 30 May 2008 (PDT)&lt;/div&gt;</summary>
		<author><name>Dchristian</name></author>
	</entry>
	<entry>
		<id>https://www.orcaware.com/svn/mediawiki/index.php?title=Server_performance_tuning_for_Linux_and_Unix&amp;diff=1750</id>
		<title>Server performance tuning for Linux and Unix</title>
		<link rel="alternate" type="text/html" href="https://www.orcaware.com/svn/mediawiki/index.php?title=Server_performance_tuning_for_Linux_and_Unix&amp;diff=1750"/>
		<updated>2008-07-15T18:22:51Z</updated>

		<summary type="html">&lt;p&gt;Dchristian: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== General notes ==&lt;br /&gt;
&lt;br /&gt;
There are several good web sites and books about how to set up subversion, but I couldn&#039;t find anything about how to optimize performance.  This is a guide to understanding and improving the performance of subversion when it is served using svnserve or HTTP (via apache).&lt;br /&gt;
&lt;br /&gt;
Much of the operating system tuning is similar to what would be done for a database or e-mail server.  It can be useful to search those areas for additional advice http://dev.mysql.com/doc/refman/5.0/en/innodb-configuration.html&lt;br /&gt;
&lt;br /&gt;
These notes have a lot of Linux specific details, but the concepts should apply to most Unix based systems.  Feel free to add details for other flavors of Unix here.  Non-posix tuning notes should probably go on another page.&lt;br /&gt;
&lt;br /&gt;
The repository can be stored in either the Berkeley DB or the FSFS repository format.  The choice is hidden from users.  FSFS is generally considered to be faster.&lt;br /&gt;
&lt;br /&gt;
== Making reads cheaper ==&lt;br /&gt;
&lt;br /&gt;
The Unix concept of &amp;quot;file access time&amp;quot; (commonly called atime) is a performance problem.  When filesystem semantics were being defined, it seemed like a good idea to know when a file was last accessed.  The downside is that every file open call now causes a disk write.  A few utilities use this information (e.g. tmpwatch and mail), but subversion never uses atime.  Subversion performance is improved by avoiding the access time writes.&lt;br /&gt;
&lt;br /&gt;
For a local filesystem, you can disable this behavior with mount options.  On Linux, it&#039;s the &#039;noatime&#039; and &#039;nodiratime&#039; options.  On an NFS filesystem, the atime recording happens on the server and must be disabled in the server&#039;s configuration.&lt;br /&gt;
&lt;br /&gt;
A lazy atime approach called &amp;quot;relatime&amp;quot; was introduced in Linux-2.6.20 and mount-2.13.  This eliminates most atime writes without breaking the few utilities that need it.  This is most useful if the repository must be on the same partition as the mail spool and/or temporary files.  See: http://kerneltrap.org/node/14148 and http://kernelnewbies.org/Linux_2_6_20&lt;br /&gt;
&lt;br /&gt;
== Making writes cheaper ==&lt;br /&gt;
&lt;br /&gt;
Subversion uses the fsync() call (or the equivalent on non-Unix operating systems) to tell the operating system to write data to disk.  Up until that point, the data is usually only in memory and the operating system will write it to disk &amp;quot;when it gets around to it&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
By calling fsync() before finishing a commit, subversion is trying to guarantee that everything it said had been done would be there when the machine re-boots.  Waiting for data to write out to disk is often the slowest part of a commit.&lt;br /&gt;
&lt;br /&gt;
However, the operating system doesn&#039;t always hold up its end of the bargain.  On Linux, fsync() only ensures that the data is on its way to the disk &amp;quot;as soon as possible&amp;quot;.  If the write cache is enabled on the drive, then fsync() doesn&#039;t actually wait for the data to hit the disk platter before returning.  This means there is a window of time during which a power loss can leave the disk state out of step with what subversion reported.&lt;br /&gt;
&lt;br /&gt;
One way to significantly increase fsync() performance is to use a RAID controller with a battery-backed write cache.  The cache is treated as part of the disk system.  As soon as the data is in the cache, the fsync() can safely return.  This means you don&#039;t have to wait for the disk head seek or the data transfer.  If power is interrupted, the RAID controller will finish writing out the cache when power is restored.&lt;br /&gt;
&lt;br /&gt;
A newer way to avoid this problem is a flash-based disk.  There is no latency from head movement or waiting for the disk to rotate.  This becomes more significant when writing many small files (like many FSFS writes).  The current downsides of flash disks are high cost, limited capacity, and low write bandwidth (but these problems are improving).&lt;br /&gt;
&lt;br /&gt;
== Reducing the number of writes ==&lt;br /&gt;
&lt;br /&gt;
As of subversion-1.5, transactions can be built up on a different filesystem than the one holding the repository.  This is valuable when the repository lives on a slower filesystem like NFS.&lt;br /&gt;
&lt;br /&gt;
To implement this, do the following:&lt;br /&gt;
  stop all servers that can write to the repository&lt;br /&gt;
  cd REPO_PATH/db&lt;br /&gt;
  mv transactions /LOCAL/DISK/PATH/&lt;br /&gt;
  ln -s /LOCAL/DISK/PATH/transactions .&lt;br /&gt;
  start the servers&lt;br /&gt;
&lt;br /&gt;
== Reduce directory index size ==&lt;br /&gt;
&lt;br /&gt;
The subversion-1.5 repository format allows the revisions to be stored in subdirectories that don&#039;t grow past a specified size.  This allows repositories to store many more revisions than can (efficiently) be stored in one directory.&lt;br /&gt;
&lt;br /&gt;
Modern filesystems can handle hundreds of thousands of files in a single directory.  However, performance can suffer as the directory index starts to use multiple levels of indirection.  Some administration tools may also have trouble with very large directories.  Splitting the revision store into sub-directories avoids all these problems.&lt;br /&gt;
&lt;br /&gt;
The shard size can be adjusted by editing the &amp;quot;layout sharded&amp;quot; line in &amp;quot;db/format&amp;quot; after &#039;svnadmin create&#039; but before populating the repository.  The default is 1000 revisions per subdirectory.  Non-sharded repositories can be loaded into a new, sharded repository using &amp;quot;svnadmin load&amp;quot; or &amp;quot;svnsync&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== Optimize write-once files on NFS ==&lt;br /&gt;
&lt;br /&gt;
If the repository is on an NFS filesystem, then a cache consistency check is made every time a file is opened.  Since the revision files in an FSFS repository never change, it is worthwhile to skip the cache checks on these files.  The subversion-1.5 repository format stores immutable files in specific subdirectories so that this can be done.&lt;br /&gt;
&lt;br /&gt;
The NFS cache check can be disabled on Linux by passing the &#039;nocto&#039; option to the mount command (note: the man page claims this is ignored, but it isn&#039;t on linux-2.6).  You need coherency for some files, so the NFS volume is also mounted without the option on a different mount point.  Symbolic links are made from the cache coherent mount point to the &#039;nocto&#039; mount for these directories: revs and txn-protorevs.&lt;br /&gt;
&lt;br /&gt;
Implementation example (not complete, just an outline of the key steps):&lt;br /&gt;
  stop all servers that can write to the repository&lt;br /&gt;
  sudo mount -t nfs nfs_server:/mount_point /mnt/svn -o \&lt;br /&gt;
    rw,nosuid,tcp,rsize=32768,wsize=32768&lt;br /&gt;
  sudo mount -t nfs nfs_server:/mount_point /mnt/svn-nocto -o \&lt;br /&gt;
    rw,nosuid,tcp,rsize=32768,wsize=32768,nocto,actimeo=3600&lt;br /&gt;
  cd /mnt/svn/repo_path/db&lt;br /&gt;
  mv revs revs-nocto&lt;br /&gt;
  mv txn-protorevs txn-protorevs-nocto&lt;br /&gt;
  ln -s /mnt/svn-nocto/repo_path/db/revs-nocto revs&lt;br /&gt;
  ln -s /mnt/svn-nocto/repo_path/db/txn-protorevs-nocto txn-protorevs&lt;br /&gt;
  start the servers&lt;br /&gt;
  &lt;br /&gt;
== Increase NFS caching timeout ==&lt;br /&gt;
&lt;br /&gt;
On Linux, metadata on files from NFS is only kept for a finite period of time.  This can be changed by passing the actimeo option to the mount command.  The man page claims the default is 60 (seconds), but some experimentation suggests it may be higher than that.  For a &#039;nocto&#039; mount point, this value can be raised to something much larger (e.g. 3600).  See the above example.&lt;br /&gt;
&lt;br /&gt;
== Distributing CPU load ==&lt;br /&gt;
&lt;br /&gt;
The subversion server communicates with the clients by transmitting differences in state, so the CPU load to calculate the difference can be significant.  By storing the repository on NFS, you can have multiple &amp;quot;front end&amp;quot; (FE) systems that share the computational load and provide redundancy.  A network load balancer makes all the FEs appear as one server to users.&lt;br /&gt;
&lt;br /&gt;
The FEs can either run svnserve or http-DAV.  If DAV is used, you need to ensure that the load balancer keeps an entire transaction on the same FE (to allow transactions to be built up on local disk).  The load balancer must be configured with &amp;quot;machine affinity&amp;quot; set, so that all HTTP connections from a client will be routed to the same server.  You should also configure apache to keep a single TCP connection for the entire transaction (see example below).&lt;br /&gt;
&lt;br /&gt;
Apache configuration to maintain a TCP connection:&lt;br /&gt;
  # 1. Enable HTTP persistent connections so a single transaction can&lt;br /&gt;
  #    be built up over a single connection.&lt;br /&gt;
  KeepAlive             on&lt;br /&gt;
  # 2. Allow as many KeepAlives as required (0 =&amp;gt; infinite) to keep&lt;br /&gt;
  #    the same connection alive.&lt;br /&gt;
  MaxKeepAliveRequests  0&lt;br /&gt;
  # 3. Limit a child to serving only this 1 connection.&lt;br /&gt;
  MaxRequestsPerChild   1&lt;br /&gt;
&lt;br /&gt;
The last one is counter-intuitive, but see the &amp;quot;Note&amp;quot; at http://httpd.apache.org/docs/2.2/mod/mpm_common.html#maxrequestsperchild.&lt;br /&gt;
&lt;br /&gt;
== High storage system reliability ==&lt;br /&gt;
&lt;br /&gt;
The purpose of a version control system is to store a sequence of file/directory versions so you can retrieve them in the future.  None of this matters if the storage system fails.&lt;br /&gt;
&lt;br /&gt;
The simplest step is to do periodic backups of the repository.  This limits the loss to the changes that happened since the last backup.  If the repository is large and the commit rate is high, it may be impossible to back up frequently enough to prevent significant data loss.  For example, if your repository gets one commit per second and you do a backup every hour, you may lose 3600 revisions if the disk fails.  This is a large scale example, but the point is to gather your own numbers and figure out how much you might lose.&lt;br /&gt;
&lt;br /&gt;
The next step is to make the disk system redundant using RAID technology.  This allows one (and sometimes more) disks to fail without losing data.  This still won&#039;t help if additional disks fail during recovery or the entire array is lost due to fire, theft, etc.&lt;br /&gt;
&lt;br /&gt;
Advanced NFS servers can be configured to do synchronous mirroring and/or asynchronous mirroring (also known as snapshot replication).&lt;br /&gt;
These capabilities are available in some commercial servers, or you can find various free alternatives by searching for &amp;quot;NFS server high availability&amp;quot; or &amp;quot;NFS server snapshot replication&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Synchronous mirroring sends every write to two independent storage systems and requires a high bandwidth network (e.g. gigabit ethernet).  It reduces performance, but the caching optimizations listed above can help.  The primary and slave systems are usually located in different rooms (or buildings) and on different electrical circuits.&lt;br /&gt;
&lt;br /&gt;
Asynchronous mirroring periodically updates a second storage system with changes from the master.  Every few minutes it makes a &amp;quot;snapshot&amp;quot; of the system and then transmits the difference between the previous snapshot and the current one to the slave.  This uses less bandwidth, but lags the main filesystem by several minutes.  It can allow a backup filesystem to be located in another geographic region.&lt;br /&gt;
&lt;br /&gt;
Another approach is to use subversion tools to maintain a mirror.  Set up svnsync to periodically sync a backup server off the main one.  This can lag behind by the polling interval, but it is simple to set up.&lt;br /&gt;
&lt;br /&gt;
You can eliminate the lag by setting up a post-commit script that runs &amp;quot;svnadmin dump --incremental -r N&amp;quot; of that commit onto a separate partition/server.  This creates a transaction log of commits that can be replayed on a recent backup to restore full state.&lt;br /&gt;
&lt;br /&gt;
== Watch your entropy ==&lt;br /&gt;
&lt;br /&gt;
When a server handles lots of queries, certain protocols can deplete the [http://en.wikipedia.org/wiki/Entropy_(computing) entropy] [http://en.wikipedia.org/wiki/Urandom pool].  The svn:// protocol (served by svnserve) and [http://en.wikipedia.org/wiki/Simple_Authentication_and_Security_Layer SASL] ciphers read from /dev/random for every new connection.  If the entropy pool becomes depleted, then the service will become very slow.&lt;br /&gt;
&lt;br /&gt;
The pool should have 100+ bits in it for good operation.  You can check the entropy pool size on Linux like this:&lt;br /&gt;
  sysctl kernel.random.entropy_avail&lt;br /&gt;
&lt;br /&gt;
There are patches pending (as of subversion 1.5.0 and apr 1.2.2) to improve this (read from urandom instead of random).  TODO: include bug numbers and track when these ship.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
(TODO: reference the SVN benchmarking capabilities in mstone http://mstone.sourceforge.net/)&lt;br /&gt;
&lt;br /&gt;
[[User:Dchristian|Dchristian]] 13:42, 30 May 2008 (PDT)&lt;/div&gt;</summary>
		<author><name>Dchristian</name></author>
	</entry>
	<entry>
		<id>https://www.orcaware.com/svn/mediawiki/index.php?title=Main_Page&amp;diff=1749</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://www.orcaware.com/svn/mediawiki/index.php?title=Main_Page&amp;diff=1749"/>
		<updated>2008-06-04T18:10:32Z</updated>

		<summary type="html">&lt;p&gt;Dchristian: /* HowTos */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;big&amp;gt;&#039;&#039;&#039;Welcome to the Subversion Wiki.&#039;&#039;&#039;&amp;lt;/big&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Got a question about Subversion? ==&lt;br /&gt;
&lt;br /&gt;
Since this wiki seems to lack content at the moment, try the [http://subversion.tigris.org/faq.html Subversion FAQ].&lt;br /&gt;
&lt;br /&gt;
Also, there is [http://subversionary.org another wiki] related to Subversion.  At this time, it contains more content than this wiki.  However, unlike this wiki, it doesn&#039;t use the more popular MediaWiki engine and, apparently due to spam problems, was forced to shut off public editing.&lt;br /&gt;
&lt;br /&gt;
== HowTos ==&lt;br /&gt;
&lt;br /&gt;
* Manage [[System Configurations]] like &amp;lt;tt&amp;gt;/etc&amp;lt;/tt&amp;gt; files using subversion.&lt;br /&gt;
* Setting up [[Subversion configured for Windows Active Directory HTTPS]]&lt;br /&gt;
* Setting up [[Automatic lock-modify-unlock]] for binary files&lt;br /&gt;
* How to do [[Server performance tuning for Linux and Unix]]&lt;br /&gt;
&lt;br /&gt;
== Contrib tools ==&lt;br /&gt;
&lt;br /&gt;
* [[svnmerge.py]] - Automatic branch management with merge tracking support.&lt;br /&gt;
* [[Repository Upgrade]] - Upgrade from an older repository to latest (1.4) (Windows)&lt;br /&gt;
* [[Subclipse]] - The subversion plugin for Eclipse&lt;br /&gt;
* [[SVNKit]] - Pure Java Subversion library&lt;br /&gt;
* [http://svnnotifier.tigris.org SVN Notifier] - Notifies you about other people&#039;s commits to subversion&lt;br /&gt;
&lt;br /&gt;
== MediaWiki help ==&lt;br /&gt;
&lt;br /&gt;
This wiki uses the MediaWiki software. Consult the [http://meta.wikipedia.org/wiki/MediaWiki_User&#039;s_Guide User&#039;s Guide] for information on using the wiki software.&lt;br /&gt;
&lt;br /&gt;
=== Getting started ===&lt;br /&gt;
&lt;br /&gt;
* [http://www.mediawiki.org/wiki/Help:Configuration_settings Configuration settings list]&lt;br /&gt;
* [http://www.mediawiki.org/wiki/Help:FAQ MediaWiki FAQ]&lt;br /&gt;
* [http://mail.wikipedia.org/mailman/listinfo/mediawiki-announce MediaWiki release mailing list]&lt;br /&gt;
* [http://alfredfazio.ws/notes%3Asubversion Fast-paced overview of Subversion for experienced users]&lt;/div&gt;</summary>
		<author><name>Dchristian</name></author>
	</entry>
	<entry>
		<id>https://www.orcaware.com/svn/mediawiki/index.php?title=Server_performance_tuning_for_Linux_and_Unix&amp;diff=1748</id>
		<title>Server performance tuning for Linux and Unix</title>
		<link rel="alternate" type="text/html" href="https://www.orcaware.com/svn/mediawiki/index.php?title=Server_performance_tuning_for_Linux_and_Unix&amp;diff=1748"/>
		<updated>2008-06-04T18:06:55Z</updated>

		<summary type="html">&lt;p&gt;Dchristian: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== General notes ==&lt;br /&gt;
&lt;br /&gt;
There are several good web sites and books about how to set up Subversion, but I couldn&#039;t find anything about how to optimize performance.  This is a guide to understanding and improving the performance of Subversion when it is served using svnserve or HTTP (via Apache).&lt;br /&gt;
&lt;br /&gt;
Much of the operating system tuning is similar to what would be done for a database or e-mail server.  It can be useful to search those areas for additional advice, for example http://dev.mysql.com/doc/refman/5.0/en/innodb-configuration.html&lt;br /&gt;
&lt;br /&gt;
These notes have a lot of Linux specific details, but the concepts should apply to most Unix based systems.  Feel free to add details for other flavors of Unix here.  Non-posix tuning notes should probably go on another page.&lt;br /&gt;
&lt;br /&gt;
The repository can be stored as either a Berkeley DB database or in the FSFS repository formats.  This is hidden from users.  FSFS is generally considered to be faster. &lt;br /&gt;
&lt;br /&gt;
== Making reads cheaper ==&lt;br /&gt;
&lt;br /&gt;
The Unix concept of &amp;quot;file access time&amp;quot; (commonly called atime) is a performance problem.  When filesystem semantics were being defined, it seemed like a good idea to know when a file was last accessed.  The down side is that every file open call now causes a disk write.  A few utilities use this information (e.g. tmpwatch and mail), but subversion never uses atime.  Subversion performance is improved by avoiding the access time writes.&lt;br /&gt;
&lt;br /&gt;
For a local filesystem, you can disable this behavior with mount options.  On Linux, it&#039;s the &#039;noatime&#039; and &#039;nodiratime&#039; options.  On a NFS filesystem, the atime recording happens on the server and must be disabled in the server&#039;s configuration.&lt;br /&gt;
&lt;br /&gt;
A lazy atime approach called &amp;quot;relatime&amp;quot; was introduced in Linux-2.6.20 and mount-2.13.  This eliminates most atime writes without breaking the few utilities that need it.  This is most useful if the repository must be on the same partition as the mail spool and/or temporary&lt;br /&gt;
files.  See: http://kerneltrap.org/node/14148 and http://kernelnewbies.org/Linux_2_6_20&lt;br /&gt;
&lt;br /&gt;
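One way to confirm that the mount options took effect is to inspect the kernel&#039;s mount table.  The following is only a sketch: the function name has_noatime and the /srv/svn mount point are assumptions, and the input file is expected to use the /proc/mounts layout (device, mount point, type, comma separated options).&lt;br /&gt;
&lt;br /&gt;
```shell
# has_noatime MOUNTS_FILE MOUNT_POINT
# Succeeds (exit status 0) if MOUNT_POINT is listed in MOUNTS_FILE with
# the noatime option.  MOUNTS_FILE uses the /proc/mounts layout.
# The function name and the paths used below are hypothetical.
has_noatime() {
    awk -v mp="$2" '
        $2 == mp {
            split($4, opts, ",")
            for (i in opts) if (opts[i] == "noatime") found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Example use on Linux (the mount point is an assumption):
#   if has_noatime /proc/mounts /srv/svn; then echo "atime writes are off"; fi
```
&lt;br /&gt;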
== Making writes cheaper ==&lt;br /&gt;
&lt;br /&gt;
Subversion uses the fsync() call (or the equivalent on non-Unix operating systems) to tell the operating system to write data to disk.  Up until that point, the data is usually only in memory and the operating system will write it to disk &amp;quot;when it gets around to it&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
By calling fsync() before finishing a commit, subversion is trying to guarantee that everything it said had been done would be there when the machine re-boots.  Waiting for data to write out to disk is often the slowest part of a commit.&lt;br /&gt;
&lt;br /&gt;
However, the operating system doesn&#039;t always hold up its end of the bargain.  On Linux, fsync() only ensures that the data is on its way to the disk &amp;quot;as soon as possible&amp;quot;.  If write cache is enabled on the drive, then it doesn&#039;t actually wait for the data to hit the disk platter before returning.  This means there is a window of time that a power loss can cause the disk state to not match what subversion returned.&lt;br /&gt;
&lt;br /&gt;
One way to significantly increase fsync() performance is to use a RAID controller with a battery backed write cache.  The cache is treated as part of the disk system.  As soon as the data is in the cache, the fsync() can safely return.  This means you don&#039;t have to wait for the&lt;br /&gt;
disk head seek or the data transfer.  If power is interrupted, the RAID controller will finish writing out the cache when power is restored.&lt;br /&gt;
&lt;br /&gt;
A newer way to avoid this problem is a flash based disk.  There is no latency from head movement or waiting for the disk to rotate.  This becomes more significant when writing many small files (like many FSFS writes).  The current downsides of flash disks are high cost, limited&lt;br /&gt;
capacity, and low write bandwidth (but these problems are improving).&lt;br /&gt;
&lt;br /&gt;
== Reducing the number of writes ==&lt;br /&gt;
&lt;br /&gt;
As of subversion-1.5, transactions can be built up on a different filesystem than the one holding the repository.  This is valuable when the repository lives on a slower filesystem like NFS.&lt;br /&gt;
&lt;br /&gt;
To implement this, do the following:&lt;br /&gt;
  # Stop all servers that can write to the repository.&lt;br /&gt;
  cd REPO_PATH/db&lt;br /&gt;
  mv transactions /LOCAL/DISK/PATH/&lt;br /&gt;
  ln -s /LOCAL/DISK/PATH/transactions .&lt;br /&gt;
  # Start the servers again.&lt;br /&gt;
&lt;br /&gt;
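The same steps can be wrapped in a small function that refuses to run twice.  This is a sketch under assumptions, not a supported tool: the function name relocate_transactions and both of its arguments are made up for illustration, and stopping and restarting the servers is still left to the administrator.&lt;br /&gt;
&lt;br /&gt;
```shell
# relocate_transactions REPO_DB LOCAL_DIR
# Moves REPO_DB/transactions into LOCAL_DIR and leaves a symbolic link
# in its place.  Stop every server that can write to the repository
# before calling it.  The function name and arguments are hypothetical.
relocate_transactions() {
    repo_db=$1
    local_dir=$2
    # Refuse to run if the directory is missing or was already relocated.
    if [ ! -d "$repo_db/transactions" ] || [ -L "$repo_db/transactions" ]; then
        echo "transactions is missing or already a symbolic link"
        return 1
    fi
    mkdir -p "$local_dir" || return 1
    mv "$repo_db/transactions" "$local_dir/" || return 1
    ln -s "$local_dir/transactions" "$repo_db/transactions"
}

# Example (both paths are placeholders; LOCAL_DIR should be absolute):
#   relocate_transactions REPO_PATH/db /LOCAL/DISK/PATH
```
&lt;br /&gt;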
== Reduce directory index size ==&lt;br /&gt;
&lt;br /&gt;
The subversion-1.5 repository format allows the revisions to be stored in subdirectories that don&#039;t grow past a specified size.  This allows repositories to store many more revisions than can (efficiently) be stored in one directory.&lt;br /&gt;
&lt;br /&gt;
Modern filesystems can handle hundreds of thousands of files in a single directory.  However, performance can suffer as the directory index starts to use multiple levels of indirection.  Some administration tools may also have trouble with very large directories.  Splitting the revision store into sub-directories avoids all these problems.&lt;br /&gt;
&lt;br /&gt;
The shard size can be adjusted by editing the &amp;quot;layout sharded&amp;quot; line in &amp;quot;db/format&amp;quot; after &#039;svnadmin create&#039; but before populating the repository.  The default is 1000 revisions per subdirectory. Non-sharded repositories can be loaded into a new, sharded, repository using &amp;quot;svnadmin load&amp;quot; or &amp;quot;svnsync&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
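To see which shard size a repository is using, the layout line can be read back from the format file.  The sketch below assumes the line has the form &#039;layout sharded N&#039; as described above; the function name shard_size is made up for illustration.&lt;br /&gt;
&lt;br /&gt;
```shell
# shard_size FORMAT_FILE
# Prints the number of revisions per shard recorded on the
# "layout sharded" line of FORMAT_FILE (normally REPO_PATH/db/format).
# Prints nothing if no such line is present.
# The function name is hypothetical.
shard_size() {
    awk '$1 == "layout" { if ($2 == "sharded") print $3 }' "$1"
}

# Example:
#   shard_size REPO_PATH/db/format
```
&lt;br /&gt;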
== Optimize write-once files on NFS ==&lt;br /&gt;
&lt;br /&gt;
If the repository is on an NFS filesystem, then a cache consistency check is made every time a file is opened.  Since the revision files in a FSFS repository never change, it is worthwhile to skip the cache checks on these files.  The subversion-1.5 repository format stores immutable files in specific subdirectories so that this can be done.&lt;br /&gt;
&lt;br /&gt;
The NFS cache check can be disabled on Linux by passing the &#039;nocto&#039; option to the mount command (note: the man page claims this is ignored, but it isn&#039;t on linux-2.6).  You need coherency for some files, so the NFS volume is also mounted without the option on a&lt;br /&gt;
different mount point.  Symbolic links are made from the cache coherent mount point to the &#039;nocto&#039; mount for these directories: revs and txn-protorevs.&lt;br /&gt;
&lt;br /&gt;
Implementation example (not complete, just an outline of the key steps):&lt;br /&gt;
  # Stop all servers that can write to the repository.&lt;br /&gt;
  sudo mount -t nfs nfs_server:/mount_point /mnt/svn -o \&lt;br /&gt;
    rw,nosuid,tcp,rsize=32768,wsize=32768&lt;br /&gt;
  sudo mount -t nfs nfs_server:/mount_point /mnt/svn-nocto -o \&lt;br /&gt;
    rw,nosuid,tcp,rsize=32768,wsize=32768,nocto,actimeo=3600&lt;br /&gt;
  cd /mnt/svn/repo_path/db&lt;br /&gt;
  mv revs revs-nocto&lt;br /&gt;
  mv txn-protorevs txn-protorevs-nocto&lt;br /&gt;
  ln -s /mnt/svn-nocto/repo_path/db/revs-nocto revs&lt;br /&gt;
  ln -s /mnt/svn-nocto/repo_path/db/txn-protorevs-nocto txn-protorevs&lt;br /&gt;
  # Start the servers again.&lt;br /&gt;
  &lt;br /&gt;
== Increase NFS caching timeout ==&lt;br /&gt;
&lt;br /&gt;
On Linux, metadata on files from NFS is only kept for a finite period of time.  This can be changed by passing the actimeo option to the mount command.  The man page claims the default is 60 (seconds), but some experimentation suggests it may be higher than that.  For a&lt;br /&gt;
&#039;nocto&#039; mount point, this value can be raised to something much larger (e.g. 3600).  See the above example.&lt;br /&gt;
&lt;br /&gt;
== Distributing CPU load ==&lt;br /&gt;
&lt;br /&gt;
Subversion communicates with its clients by transmitting differences in state, so the CPU load to calculate the differences can be significant.  By storing the repository on NFS, you can have&lt;br /&gt;
multiple &amp;quot;front end&amp;quot; (FE) systems that share the computational load and provide redundancy.  A network load balancer makes all front ends (FEs) appear as one server to users.&lt;br /&gt;
&lt;br /&gt;
The FEs can either run svnserve or http-DAV.  If DAV is used, you need to ensure that the load balancer keeps an entire transaction on the same FE (to allow transactions to be built up on local disk).  The load balancer must be configured with &amp;quot;machine affinity&amp;quot; set, so that&lt;br /&gt;
all HTTP connections from a client will be routed to the same server. You should also configure apache to keep a single TCP connection for the entire transaction (see example below).&lt;br /&gt;
&lt;br /&gt;
Apache configuration to maintain a TCP connection:&lt;br /&gt;
  # 1. Enable HTTP persistent connections so a single transaction can&lt;br /&gt;
  #    be built up over a single connection.&lt;br /&gt;
  KeepAlive             on&lt;br /&gt;
  # 2. Allow as many KeepAlives as required (0 =&amp;gt; infinite) to keep&lt;br /&gt;
  #    the same connection alive.&lt;br /&gt;
  MaxKeepAliveRequests  0&lt;br /&gt;
  # 3. Limit a child to serving only this 1 connection.&lt;br /&gt;
  MaxRequestsPerChild   1&lt;br /&gt;
&lt;br /&gt;
The last one is counter-intuitive, but see the &amp;quot;Note&amp;quot; at http://httpd.apache.org/docs/2.2/mod/mpm_common.html#maxrequestsperchild.&lt;br /&gt;
&lt;br /&gt;
== High storage system reliability ==&lt;br /&gt;
&lt;br /&gt;
The purpose of a version control system is to store a sequence of file/directory versions so you can retrieve them in the future.  None of this matters if the storage system fails.&lt;br /&gt;
&lt;br /&gt;
The simplest step is to do periodic backups of the repository.  This limits the loss to the changes that happened since the last backup.  If the repository is large and the commit rate is high, it may be impossible to back up frequently enough to prevent significant data loss.  For example, if your repository gets one commit per second and you do a backup every hour, you may lose up to 3600 revisions if the disk fails.  This is a large scale example, but the point is to gather your own numbers and figure out how much you might lose.&lt;br /&gt;
&lt;br /&gt;
The next step is to make the disk system redundant using RAID technology.  This allows one (and sometimes more) disks to fail without losing data.  This still won&#039;t help if additional disks fail&lt;br /&gt;
during recovery or the entire array is lost due to fire, theft, etc.&lt;br /&gt;
&lt;br /&gt;
Advanced NFS servers can be configured to do synchronous mirroring and/or asynchronous mirroring (also known as snapshot replication).&lt;br /&gt;
These capabilities are available in some commercial servers, or you can find various free alternatives by searching for &amp;quot;NFS server high availability&amp;quot; or &amp;quot;NFS server snapshot replication&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Synchronous mirroring sends every write to two independent storage systems and requires a high bandwidth network (e.g. gigabit ethernet).  It reduces performance, but the caching optimizations listed above can help.  The primary and slave systems are usually located in different&lt;br /&gt;
rooms (or buildings) and on different electrical circuits.&lt;br /&gt;
&lt;br /&gt;
Asynchronous mirroring periodically updates a second storage system with changes from the master.  It makes a &amp;quot;snapshot&amp;quot; of the system every few minutes and then transmits the difference between the previous snapshot and the current one to the slave.  This uses less bandwidth, but lags the main filesystem by several minutes.  It can allow a backup filesystem to be located in another geographic region.&lt;br /&gt;
&lt;br /&gt;
Another approach is to use subversion tools to maintain a mirror.  Set up svnsync to periodically sync a backup server off the main one.  This can lag behind by the polling interval, but it is simple to set up.&lt;br /&gt;
&lt;br /&gt;
You can eliminate the lag by setting up a post-commit script that runs &amp;quot;svnadmin dump --incremental -r N&amp;quot; of that commit onto a separate partition/server.  This creates a transaction log of commits that can be replayed on a recent backup to restore full state.&lt;br /&gt;
&lt;br /&gt;
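A post-commit hook along these lines might look like the sketch below.  Subversion passes the repository path and the new revision number to post-commit as its first two arguments; the function name dump_revision, the /backup/svn-dumps directory, and the rN.dump file naming are assumptions made for illustration.  The dump is written with tee so that the number of bytes written can be reported.&lt;br /&gt;
&lt;br /&gt;
```shell
# dump_revision REPOS REV DEST_DIR
# Writes an incremental dump of revision REV to DEST_DIR/rREV.dump and
# prints the number of bytes written.
# Note: the exit status of svnadmin is lost in the pipeline; a
# production hook would need to check it.
# The function name and the file naming scheme are hypothetical.
dump_revision() {
    repos=$1
    rev=$2
    dest_dir=$3
    svnadmin dump "$repos" --incremental --quiet -r "$rev" |
        tee "$dest_dir/r$rev.dump" | wc -c
}

# In hooks/post-commit (the destination directory is an assumption):
#   dump_revision "$1" "$2" /backup/svn-dumps
```
&lt;br /&gt;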
&lt;br /&gt;
(TODO: reference the SVN benchmarking capabilities in mstone http://mstone.sourceforge.net/)&lt;br /&gt;
&lt;br /&gt;
[[User:Dchristian|Dchristian]] 13:42, 30 May 2008 (PDT)&lt;/div&gt;</summary>
		<author><name>Dchristian</name></author>
	</entry>
	<entry>
		<id>https://www.orcaware.com/svn/mediawiki/index.php?title=User_talk:Dchristian&amp;diff=1747</id>
		<title>User talk:Dchristian</title>
		<link rel="alternate" type="text/html" href="https://www.orcaware.com/svn/mediawiki/index.php?title=User_talk:Dchristian&amp;diff=1747"/>
		<updated>2008-06-04T17:46:26Z</updated>

		<summary type="html">&lt;p&gt;Dchristian: User talk:Dchristian moved to Server performance tuning for Linux and Unix: Make this widely available&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;#REDIRECT [[Server performance tuning for Linux and Unix]]&lt;/div&gt;</summary>
		<author><name>Dchristian</name></author>
	</entry>
	<entry>
		<id>https://www.orcaware.com/svn/mediawiki/index.php?title=Server_performance_tuning_for_Linux_and_Unix&amp;diff=1746</id>
		<title>Server performance tuning for Linux and Unix</title>
		<link rel="alternate" type="text/html" href="https://www.orcaware.com/svn/mediawiki/index.php?title=Server_performance_tuning_for_Linux_and_Unix&amp;diff=1746"/>
		<updated>2008-06-04T17:46:26Z</updated>

		<summary type="html">&lt;p&gt;Dchristian: User talk:Dchristian moved to Server performance tuning for Linux and Unix: Make this widely available&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Server performance tuning for Linux/Unix =&lt;br /&gt;
[[User:Dchristian|Dchristian]] 13:42, 30 May 2008 (PDT)&lt;br /&gt;
&lt;br /&gt;
== General notes: ==&lt;br /&gt;
&lt;br /&gt;
Most of these tips apply to both the Berkeley DB and the FSFS&lt;br /&gt;
repository formats.  FSFS is generally considered to be faster.&lt;br /&gt;
Database servers have the same kinds of problems and are a good source&lt;br /&gt;
for disk and operating system (OS) tuning advice.&lt;br /&gt;
http://dev.mysql.com/doc/refman/5.0/en/innodb-configuration.html&lt;br /&gt;
&lt;br /&gt;
These notes have a lot of Linux specific details, but the concepts should apply to most Unix based systems.  Feel free to add details for other flavors of Unix here.  Non-posix performance notes should probably go on another page.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Making reads cheaper: ==&lt;br /&gt;
&lt;br /&gt;
The Unix concept of &amp;quot;file access time&amp;quot; (commonly called atime) is a&lt;br /&gt;
performance problem.  When filesystem semantics were being defined, it&lt;br /&gt;
seemed like a good idea to know when a file was last accessed.  The&lt;br /&gt;
down side is that every file open call now causes a disk write.  A few&lt;br /&gt;
utilities use this information (e.g. tmpwatch and mail), but subversion&lt;br /&gt;
never uses atime.  Subversion performance is improved by avoiding the&lt;br /&gt;
access time writes.&lt;br /&gt;
&lt;br /&gt;
For a local filesystem, you can disable this behavior with mount&lt;br /&gt;
options.  On Linux, it&#039;s the &#039;noatime&#039; and &#039;nodiratime&#039; options.  On a&lt;br /&gt;
NFS filesystem, the atime recording happens on the server and must be&lt;br /&gt;
disabled in the server&#039;s configuration.&lt;br /&gt;
&lt;br /&gt;
A lazy atime approach called &amp;quot;relatime&amp;quot; was introduced in Linux-2.6.20&lt;br /&gt;
and mount-2.13.  This eliminates most atime writes without breaking&lt;br /&gt;
the few utilities that need it.  This is most useful if the repository&lt;br /&gt;
must be on the same partition as the mail spool and/or temporary&lt;br /&gt;
files.  See: http://kerneltrap.org/node/14148 and&lt;br /&gt;
http://kernelnewbies.org/Linux_2_6_20&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Making writes cheaper: ==&lt;br /&gt;
&lt;br /&gt;
Subversion uses the fsync() call (or the equivalent on non-Unix&lt;br /&gt;
operating systems) to tell the operating system to write data to disk.&lt;br /&gt;
Up until that point, the data is usually only in memory and the operating&lt;br /&gt;
system will write it to disk &amp;quot;when it gets around to it&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
By calling fsync() before finishing a commit, subversion is trying to&lt;br /&gt;
guarantee that everything it said had been done would be there when&lt;br /&gt;
the machine re-boots.  Waiting for data to write out to disk is often&lt;br /&gt;
the slowest part of a commit.&lt;br /&gt;
&lt;br /&gt;
However, the operating system doesn&#039;t always hold up its end of the&lt;br /&gt;
bargain.  On Linux, fsync() only ensures that the data is on its way&lt;br /&gt;
to the disk &amp;quot;as soon as possible&amp;quot;.  If write cache is enabled on the&lt;br /&gt;
drive, then it doesn&#039;t actually wait for the data to hit the disk&lt;br /&gt;
platter before returning.  This means there is a window of time that a&lt;br /&gt;
power loss can cause the disk state to not match what subversion&lt;br /&gt;
returned.&lt;br /&gt;
&lt;br /&gt;
One way to significantly increase fsync() performance is to use a RAID&lt;br /&gt;
controller with a battery backed write cache.  The cache is treated as&lt;br /&gt;
part of the disk system.  As soon as the data is in the cache, the&lt;br /&gt;
fsync() can safely return.  This means you don&#039;t have to wait for the&lt;br /&gt;
disk head seek or the data transfer.  If power is interrupted, the&lt;br /&gt;
RAID controller will finish writing out the cache when power is&lt;br /&gt;
restored.&lt;br /&gt;
&lt;br /&gt;
A newer way to avoid this problem is a flash based disk.  There is no&lt;br /&gt;
latency from head movement or waiting for the disk to rotate.  This&lt;br /&gt;
becomes more significant when writing many small files (like many FSFS&lt;br /&gt;
writes).  The current downsides of flash disks are high cost, limited&lt;br /&gt;
capacity, and low write bandwidth (but these problems are improving).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Reducing the number of writes: ==&lt;br /&gt;
&lt;br /&gt;
As of subversion-1.5, transactions can be built up on a different&lt;br /&gt;
filesystem than the one holding the repository.  This is valuable when&lt;br /&gt;
the repository lives on a slower filesystem like NFS.&lt;br /&gt;
&lt;br /&gt;
To implement this, do the following:&lt;br /&gt;
  # Stop all servers that can write to the repository.&lt;br /&gt;
  cd REPO_PATH/db&lt;br /&gt;
  mv transactions /LOCAL/DISK/PATH/&lt;br /&gt;
  ln -s /LOCAL/DISK/PATH/transactions .&lt;br /&gt;
  # Start the servers again.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Reduce directory index size: ==&lt;br /&gt;
&lt;br /&gt;
The subversion-1.5 repository format allows the revisions to be stored&lt;br /&gt;
in subdirectories that don&#039;t grow past a specified size.  This allows&lt;br /&gt;
repositories to store many more revisions than can (efficiently) be&lt;br /&gt;
stored in one directory.&lt;br /&gt;
&lt;br /&gt;
Modern filesystems can handle hundreds of thousands of files in a&lt;br /&gt;
single directory.  However, performance can suffer as the directory&lt;br /&gt;
index starts to use multiple levels of indirection.  Some&lt;br /&gt;
administration tools may also have trouble with very large&lt;br /&gt;
directories.  Splitting the revision store into sub-directories avoids&lt;br /&gt;
all these problems.&lt;br /&gt;
&lt;br /&gt;
The shard size can be adjusted by editing the &amp;quot;layout sharded&amp;quot; line in&lt;br /&gt;
&amp;quot;db/format&amp;quot; after &#039;svnadmin create&#039; but before populating the&lt;br /&gt;
repository.  The default is 1000 revisions per subdirectory.&lt;br /&gt;
Non-sharded repositories can be loaded into a new, sharded,&lt;br /&gt;
repository using &amp;quot;svnadmin load&amp;quot; or &amp;quot;svnsync&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Make write-once portions of the repository skip NFS cache checks: ==&lt;br /&gt;
&lt;br /&gt;
If the repository is on an NFS filesystem, then a cache consistency&lt;br /&gt;
check is made every time a file is opened.  Since the revision files&lt;br /&gt;
in a FSFS repository never change, it is worthwhile to skip the cache&lt;br /&gt;
checks on these files.  The subversion-1.5 repository format stores&lt;br /&gt;
immutable files in specific subdirectories so that this can be done.&lt;br /&gt;
&lt;br /&gt;
The NFS cache check can be disabled on Linux by passing the &#039;nocto&#039;&lt;br /&gt;
option to the mount command (note: the man page claims this is&lt;br /&gt;
ignored, but it isn&#039;t on linux-2.6).  You need coherency for some&lt;br /&gt;
files, so the NFS volume is also mounted without the option on a&lt;br /&gt;
different mount point.  Symbolic links are made from the cache&lt;br /&gt;
coherent mount point to the &#039;nocto&#039; mount for these directories: revs&lt;br /&gt;
and txn-protorevs.&lt;br /&gt;
&lt;br /&gt;
Implementation example (not complete, just an outline of the key steps):&lt;br /&gt;
  # Stop all servers that can write to the repository.&lt;br /&gt;
  sudo mount -t nfs nfs_server:/mount_point /mnt/svn -o \&lt;br /&gt;
    rw,nosuid,tcp,rsize=32768,wsize=32768&lt;br /&gt;
  sudo mount -t nfs nfs_server:/mount_point /mnt/svn-nocto -o \&lt;br /&gt;
    rw,nosuid,tcp,rsize=32768,wsize=32768,nocto,actimeo=3600&lt;br /&gt;
  cd /mnt/svn/repo_path/db&lt;br /&gt;
  mv revs revs-nocto&lt;br /&gt;
  mv txn-protorevs txn-protorevs-nocto&lt;br /&gt;
  ln -s /mnt/svn-nocto/repo_path/db/revs-nocto revs&lt;br /&gt;
  ln -s /mnt/svn-nocto/repo_path/db/txn-protorevs-nocto txn-protorevs&lt;br /&gt;
  # Start the servers again.&lt;br /&gt;
  &lt;br /&gt;
&lt;br /&gt;
== Increase NFS caching timeout: ==&lt;br /&gt;
&lt;br /&gt;
On Linux, metadata on files from NFS is only kept for a finite period&lt;br /&gt;
of time.  This can be changed by passing the actimeo option to the&lt;br /&gt;
mount command.  The man page claims the default is 60 (seconds), but&lt;br /&gt;
some experimentation suggests it may be higher than that.  For a&lt;br /&gt;
&#039;nocto&#039; mount point, this value can be raised to something much larger&lt;br /&gt;
(e.g. 3600).  See the above example.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Distributing CPU load: ==&lt;br /&gt;
&lt;br /&gt;
Subversion communicates with its clients by transmitting&lt;br /&gt;
differences in state, so the CPU load to calculate the difference can&lt;br /&gt;
be significant.  By storing the repository on NFS, you can have&lt;br /&gt;
multiple &amp;quot;front end&amp;quot; (FE) systems that share the computational load&lt;br /&gt;
and provide redundancy.  A network load balancer makes all front ends&lt;br /&gt;
(FEs) appear as one server to users.&lt;br /&gt;
&lt;br /&gt;
The FEs can either run svnserve or http-DAV.  If DAV is used, you need&lt;br /&gt;
to ensure that the load balancer keeps an entire transaction on the&lt;br /&gt;
same FE (to allow transactions to be built up on local disk).  The&lt;br /&gt;
load balancer must be configured with &amp;quot;machine affinity&amp;quot; set, so that&lt;br /&gt;
all HTTP connections from a client will be routed to the same server.&lt;br /&gt;
You should also configure apache to keep a single TCP connection for&lt;br /&gt;
the entire transaction (see example below).&lt;br /&gt;
&lt;br /&gt;
Apache configuration to maintain a TCP connection:&lt;br /&gt;
  # 1. Enable HTTP persistent connections so a single transaction can&lt;br /&gt;
  #    be built up over a single connection.&lt;br /&gt;
  KeepAlive             on&lt;br /&gt;
  # 2. Allow as many KeepAlives as required (0 =&amp;gt; infinite) to keep&lt;br /&gt;
  #    the same connection alive.&lt;br /&gt;
  MaxKeepAliveRequests  0&lt;br /&gt;
  # 3. Limit a child to serving only this 1 connection.&lt;br /&gt;
  MaxRequestsPerChild   1&lt;br /&gt;
&lt;br /&gt;
The last one is counter-intuitive, but see the &amp;quot;Note&amp;quot; at&lt;br /&gt;
http://httpd.apache.org/docs/2.2/mod/mpm_common.html#maxrequestsperchild.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== High storage system reliability: ==&lt;br /&gt;
&lt;br /&gt;
The purpose of a version control system is to store a sequence of&lt;br /&gt;
file/directory versions so you can retrieve them in the future.  None&lt;br /&gt;
of this matters if the storage system fails.&lt;br /&gt;
&lt;br /&gt;
The simplest step is to do periodic backups of the repository.  This&lt;br /&gt;
limits the loss to the changes that happened since the last backup.&lt;br /&gt;
If the repository is large and the commit rate is high, it may be&lt;br /&gt;
impossible to back up frequently enough to prevent significant data&lt;br /&gt;
loss.  For example, if your repository gets one commit per second and&lt;br /&gt;
you do a backup every hour, you may lose 3600 revisions if the disk&lt;br /&gt;
fails.  This is a large scale example, but the point is to gather&lt;br /&gt;
your own numbers and figure out how much you might lose.&lt;br /&gt;
&lt;br /&gt;
The next step is to make the disk system redundant using RAID&lt;br /&gt;
technology.  This allows one (and sometimes more) disks to fail&lt;br /&gt;
without losing data.  This still won&#039;t help if additional disks fail&lt;br /&gt;
during recovery or the entire array is lost due to fire, theft, etc.&lt;br /&gt;
&lt;br /&gt;
Advanced NFS servers can be configured to do synchronous mirroring&lt;br /&gt;
and/or asynchronous mirroring (also known as snapshot replication).&lt;br /&gt;
These capabilities are available in some commercial servers, or you&lt;br /&gt;
can find various free alternatives by searching for &amp;quot;NFS server high&lt;br /&gt;
availability&amp;quot; or &amp;quot;NFS server snapshot replication&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Synchronous mirroring sends every write to two independent storage&lt;br /&gt;
systems and requires a high bandwidth network (e.g. gigabit ethernet).&lt;br /&gt;
It reduces performance, but the caching optimizations listed above can&lt;br /&gt;
help.  The primary and slave systems are usually located in different&lt;br /&gt;
rooms (or buildings) and on different electrical circuits.&lt;br /&gt;
&lt;br /&gt;
Asynchronous mirroring periodically updates a second storage system&lt;br /&gt;
with changes from the master.  It makes a &amp;quot;snapshot&amp;quot; of&lt;br /&gt;
the system every few minutes and then transmits the difference between&lt;br /&gt;
the previous snapshot and the current one to the slave.  This uses&lt;br /&gt;
less bandwidth, but lags the main filesystem by several minutes.  It&lt;br /&gt;
can allow a backup filesystem to be located in another geographic&lt;br /&gt;
region.&lt;br /&gt;
&lt;br /&gt;
Another approach is to use subversion tools to maintain a mirror.&lt;br /&gt;
Set up svnsync to periodically sync a backup server off the main one.&lt;br /&gt;
This can lag behind by the polling interval, but it is simple to set up.&lt;br /&gt;
&lt;br /&gt;
You can eliminate the lag by setting up a post-commit script that runs&lt;br /&gt;
&amp;quot;svnadmin dump --incremental -r N&amp;quot; of that commit onto a separate&lt;br /&gt;
partition/server.  This creates a transaction log of commits that can&lt;br /&gt;
be replayed on a recent backup to restore full state.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
(TODO: reference the benchmarking capabilities in mstone)&lt;/div&gt;</summary>
		<author><name>Dchristian</name></author>
	</entry>
	<entry>
		<id>https://www.orcaware.com/svn/mediawiki/index.php?title=Server_performance_tuning_for_Linux_and_Unix&amp;diff=1745</id>
		<title>Server performance tuning for Linux and Unix</title>
		<link rel="alternate" type="text/html" href="https://www.orcaware.com/svn/mediawiki/index.php?title=Server_performance_tuning_for_Linux_and_Unix&amp;diff=1745"/>
		<updated>2008-05-30T20:42:08Z</updated>

		<summary type="html">&lt;p&gt;Dchristian: /* Server performance tuning for Linux/Unix */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Server performance tuning for Linux/Unix =&lt;br /&gt;
[[User:Dchristian|Dchristian]] 13:42, 30 May 2008 (PDT)&lt;br /&gt;
&lt;br /&gt;
== General notes: ==&lt;br /&gt;
&lt;br /&gt;
Most of these tips apply to both the Berkeley DB and the FSFS&lt;br /&gt;
repository formats.  FSFS is generally considered to be faster.&lt;br /&gt;
Database servers have the same kinds of problems and are a good source&lt;br /&gt;
for disk and operating system (OS) tuning advice.&lt;br /&gt;
http://dev.mysql.com/doc/refman/5.0/en/innodb-configuration.html&lt;br /&gt;
&lt;br /&gt;
These notes have a lot of Linux specific details, but the concepts should apply to most Unix based systems.  Feel free to add details for other flavors of Unix here.  Non-posix performance notes should probably go on another page.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Making reads cheaper: ==&lt;br /&gt;
&lt;br /&gt;
The Unix concept of &amp;quot;file access time&amp;quot; (commonly called atime) is a&lt;br /&gt;
performance problem.  When filesystem semantics were being defined, it&lt;br /&gt;
seemed like a good idea to know when a file was last accessed.  The&lt;br /&gt;
down side is that every file open call now causes a disk write.  A few&lt;br /&gt;
utilities use this information (e.g. tmpwatch and mail), but subversion&lt;br /&gt;
never uses atime.  Subversion performance is improved by avoiding the&lt;br /&gt;
access time writes.&lt;br /&gt;
&lt;br /&gt;
For a local filesystem, you can disable this behavior with mount&lt;br /&gt;
options.  On Linux, it&#039;s the &#039;noatime&#039; and &#039;nodiratime&#039; options.  On a&lt;br /&gt;
NFS filesystem, the atime recording happens on the server and must be&lt;br /&gt;
disabled in the server&#039;s configuration.&lt;br /&gt;
&lt;br /&gt;
A lazy atime approach called &amp;quot;relatime&amp;quot; was introduced in Linux-2.6.20&lt;br /&gt;
and mount-2.13.  This eliminates most atime writes without breaking&lt;br /&gt;
the few utilities that need it.  This is most useful if the repository&lt;br /&gt;
must be on the same partition as the mail spool and/or temporary&lt;br /&gt;
files.  See: http://kerneltrap.org/node/14148 and&lt;br /&gt;
http://kernelnewbies.org/Linux_2_6_20&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Making writes cheaper: ==&lt;br /&gt;
&lt;br /&gt;
Subversion uses the fsync() call (or the equivalent on non-Unix&lt;br /&gt;
operating systems) to tell the operating system to write data to disk.&lt;br /&gt;
Up until that point, the data is usually only in memory and the operating&lt;br /&gt;
system will write it to disk &amp;quot;when it gets around to it&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
By calling fsync() before finishing a commit, subversion is trying to&lt;br /&gt;
guarantee that everything it said had been done would be there when&lt;br /&gt;
the machine re-boots.  Waiting for data to write out to disk is often&lt;br /&gt;
the slowest part of a commit.&lt;br /&gt;
&lt;br /&gt;
However, the operating system doesn&#039;t always hold up its end of the&lt;br /&gt;
bargain.  On Linux, fsync() only ensures that the data is on its way&lt;br /&gt;
to the disk &amp;quot;as soon as possible&amp;quot;.  If write cache is enabled on the&lt;br /&gt;
drive, then it doesn&#039;t actually wait for the data to hit the disk&lt;br /&gt;
platter before returning.  This means there is a window of time that a&lt;br /&gt;
power loss can cause the disk state to not match what subversion&lt;br /&gt;
returned.&lt;br /&gt;
&lt;br /&gt;
One way to significantly increase fsync() performance is to use a RAID&lt;br /&gt;
controller with a battery backed write cache.  The cache is treated as&lt;br /&gt;
part of the disk system.  As soon as the data is in the cache, the&lt;br /&gt;
fsync() can safely return.  This means you don&#039;t have to wait for the&lt;br /&gt;
disk head seek or the data transfer.  If power is interrupted, the&lt;br /&gt;
RAID controller will finish writing out the cache when power is&lt;br /&gt;
restored.&lt;br /&gt;
&lt;br /&gt;
A newer way to avoid this problem is a flash based disk.  There is no&lt;br /&gt;
latency from head movement or waiting for the disk to rotate.  This&lt;br /&gt;
becomes more significant when writing many small files (like many FSFS&lt;br /&gt;
writes).  The current downsides of flash disks are high cost, limited&lt;br /&gt;
capacity, and low write bandwidth (but these problems are improving).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Reducing the number of writes: ==&lt;br /&gt;
&lt;br /&gt;
As of subversion-1.5, transactions can be built up on a different&lt;br /&gt;
filesystem than the one holding the repository.  This is valuable when&lt;br /&gt;
the repository lives on a slower filesystem like NFS.&lt;br /&gt;
&lt;br /&gt;
To implement this, do the following:&lt;br /&gt;
  # Stop all servers that can write to the repository.&lt;br /&gt;
  cd REPO_PATH/db&lt;br /&gt;
  mv transactions /LOCAL/DISK/PATH/&lt;br /&gt;
  ln -s /LOCAL/DISK/PATH/transactions .&lt;br /&gt;
  # Start the servers.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Reduce directory index size: ==&lt;br /&gt;
&lt;br /&gt;
The subversion-1.5 repository format allows the revisions to be stored&lt;br /&gt;
in subdirectories that don&#039;t grow past a specified size.  This allows&lt;br /&gt;
repositories to store many more revisions than can (efficiently) be&lt;br /&gt;
stored in one directory.&lt;br /&gt;
&lt;br /&gt;
Modern filesystems can handle hundreds of thousands of files in a&lt;br /&gt;
single directory.  However, performance can suffer as the directory&lt;br /&gt;
index starts to use multiple levels of indirection.  Some&lt;br /&gt;
administration tools may also have trouble with very large&lt;br /&gt;
directories.  Splitting the revision store into sub-directories avoids&lt;br /&gt;
all these problems.&lt;br /&gt;
&lt;br /&gt;
The shard size can be adjusted by editing the &amp;quot;layout sharded&amp;quot; line in&lt;br /&gt;
&amp;quot;db/format&amp;quot; after &#039;svnadmin create&#039; but before populating the&lt;br /&gt;
repository.  The default is 1000 revisions per subdirectory.&lt;br /&gt;
Non-sharded repositories can be loaded into a new, sharded,&lt;br /&gt;
repository using &amp;quot;svnadmin load&amp;quot; or &amp;quot;svnsync&amp;quot;.&lt;br /&gt;
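&lt;br /&gt;
As a rough illustration of how the shard size maps revisions to&lt;br /&gt;
subdirectories, the sketch below prints where a revision file would live&lt;br /&gt;
under db/.  It assumes the shard directory is simply the revision number&lt;br /&gt;
divided by the shard size; the function name rev_path is made up.&lt;br /&gt;
&lt;br /&gt;
```shell
# Print the path of a revision file, relative to db/, in a sharded layout.
# $1 = revision number, $2 = shard size from the "layout sharded" line.
rev_path() {
  echo "revs/$(( $1 / $2 ))/$1"
}

# Example: with the default shard size, revision 1234 lands in shard 1.
#   rev_path 1234 1000
```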
&lt;br /&gt;
&lt;br /&gt;
== Make write-once portions of the repository skip NFS cache checks: ==&lt;br /&gt;
&lt;br /&gt;
If the repository is on an NFS filesystem, then a cache consistency&lt;br /&gt;
check is made every time a file is opened.  Since the revision files&lt;br /&gt;
in an FSFS repository never change, it is worthwhile to skip the cache&lt;br /&gt;
checks on these files.  The subversion-1.5 repository format stores&lt;br /&gt;
immutable files in specific subdirectories so that this can be done.&lt;br /&gt;
&lt;br /&gt;
The NFS cache check can be disabled on Linux by passing the &#039;nocto&#039;&lt;br /&gt;
option to the mount command (note: the man page claims this is&lt;br /&gt;
ignored, but it isn&#039;t on Linux-2.6).  You need coherency for some&lt;br /&gt;
files, so the NFS volume is also mounted without the option on a&lt;br /&gt;
different mount point.  Symbolic links are made from the cache&lt;br /&gt;
coherent mount point to the &#039;nocto&#039; mount for these directories: revs&lt;br /&gt;
and txn-protorevs.&lt;br /&gt;
&lt;br /&gt;
Implementation example (not complete, just an outline of the key steps):&lt;br /&gt;
  # Stop all servers that can write to the repository.&lt;br /&gt;
  sudo mount -t nfs nfs_server:/mount_point /mnt/svn -o \&lt;br /&gt;
    rw,nosuid,tcp,rsize=32768,wsize=32768&lt;br /&gt;
  sudo mount -t nfs nfs_server:/mount_point /mnt/svn-nocto -o \&lt;br /&gt;
    rw,nosuid,tcp,rsize=32768,wsize=32768,nocto,actimeo=3600&lt;br /&gt;
  # revs and txn-protorevs live in the db subdirectory of the repository.&lt;br /&gt;
  cd /mnt/svn/repo_path/db&lt;br /&gt;
  mv revs revs-nocto&lt;br /&gt;
  mv txn-protorevs txn-protorevs-nocto&lt;br /&gt;
  ln -s /mnt/svn-nocto/repo_path/db/revs-nocto revs&lt;br /&gt;
  ln -s /mnt/svn-nocto/repo_path/db/txn-protorevs-nocto txn-protorevs&lt;br /&gt;
  # Start the servers.&lt;br /&gt;
  &lt;br /&gt;
&lt;br /&gt;
== Increase NFS caching timeout: ==&lt;br /&gt;
&lt;br /&gt;
On Linux, metadata on files from NFS is only kept for a finite period&lt;br /&gt;
of time.  This can be changed by passing the actimeo option to the&lt;br /&gt;
mount command.  The man page claims the default is 60 (seconds), but&lt;br /&gt;
some experimentation suggests it may be higher than that.  For a&lt;br /&gt;
&#039;nocto&#039; mount point, this value can be raised to something much larger&lt;br /&gt;
(e.g. 3600).  See the above example.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Distributing CPU load: ==&lt;br /&gt;
&lt;br /&gt;
The subversion server communicates with the clients by transmitting&lt;br /&gt;
differences in state, so the CPU load to calculate the difference can&lt;br /&gt;
be significant.  By storing the repository on NFS, you can have&lt;br /&gt;
multiple &amp;quot;front end&amp;quot; (FE) systems that share the computational load&lt;br /&gt;
and provide redundancy.  A network load balancer makes all front ends&lt;br /&gt;
(FEs) appear as one server to users.&lt;br /&gt;
&lt;br /&gt;
The FEs can either run svnserve or http-DAV.  If DAV is used, you need&lt;br /&gt;
to ensure that the load balancer keeps an entire transaction on the&lt;br /&gt;
same FE (to allow transactions to be built up on local disk).  The&lt;br /&gt;
load balancer must be configured with &amp;quot;machine affinity&amp;quot; set, so that&lt;br /&gt;
all HTTP connections from a client will be routed to the same server.&lt;br /&gt;
You should also configure apache to keep a single TCP connection for&lt;br /&gt;
the entire transaction (see example below).&lt;br /&gt;
&lt;br /&gt;
Apache configuration to maintain a TCP connection:&lt;br /&gt;
  # 1. Enable HTTP persistent connections so a single transaction can&lt;br /&gt;
  #    be built up over a single connection.&lt;br /&gt;
  KeepAlive             on&lt;br /&gt;
  # 2. Allow as many KeepAlives as required (0 =&amp;gt; infinite) to keep&lt;br /&gt;
  #    the same connection alive.&lt;br /&gt;
  MaxKeepAliveRequests  0&lt;br /&gt;
  # 3. Limit a child to serving only this 1 connection.&lt;br /&gt;
  MaxRequestsPerChild   1&lt;br /&gt;
&lt;br /&gt;
The last one is counter-intuitive, but see the &amp;quot;Note&amp;quot; at&lt;br /&gt;
http://httpd.apache.org/docs/2.2/mod/mpm_common.html#maxrequestsperchild.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== High storage system reliability: ==&lt;br /&gt;
&lt;br /&gt;
The purpose of a version control system is to store a sequence of&lt;br /&gt;
file/directory versions so you can retrieve them in the future.  None&lt;br /&gt;
of this matters if the storage system fails.&lt;br /&gt;
&lt;br /&gt;
The simplest step is to do periodic backups of the repository.  This&lt;br /&gt;
limits the loss to the changes that happened since the last backup.&lt;br /&gt;
If the repository is large and the commit rate is high, it may be&lt;br /&gt;
impossible to back up frequently enough to prevent significant data&lt;br /&gt;
loss.  For example, if your repository gets one commit per second and&lt;br /&gt;
you do a backup every hour, you may lose 3600 revisions if the disk&lt;br /&gt;
fails.  This is a large scale example, but the point is to gather&lt;br /&gt;
your own numbers and figure out how much you might lose.&lt;br /&gt;
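&lt;br /&gt;
The arithmetic can be sketched in shell as follows.  The function name&lt;br /&gt;
worst_case_loss and the choice of units (commits per hour, minutes between&lt;br /&gt;
backups) are assumptions made for illustration.&lt;br /&gt;
&lt;br /&gt;
```shell
# Worst-case number of revisions lost if the disk fails just before
# the next backup runs.
# $1 = commits per hour, $2 = minutes between backups.
worst_case_loss() {
  echo $(( $1 * $2 / 60 ))
}

# Example: one commit per second (3600 per hour) with hourly backups.
#   worst_case_loss 3600 60
```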
&lt;br /&gt;
The next step is to make the disk system redundant using RAID&lt;br /&gt;
technology.  This allows one (and sometimes more) disks to fail&lt;br /&gt;
without losing data.  This still won&#039;t help if additional disks fail&lt;br /&gt;
during recovery or the entire array is lost due to fire, theft, etc.&lt;br /&gt;
&lt;br /&gt;
Advanced NFS servers can be configured to do synchronous mirroring&lt;br /&gt;
and/or asynchronous mirroring (also known as snapshot replication).&lt;br /&gt;
These capabilities are available in some commercial servers, or you&lt;br /&gt;
can find various free alternatives by searching for &amp;quot;NFS server high&lt;br /&gt;
availability&amp;quot; or &amp;quot;NFS server snapshot replication&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Synchronous mirroring sends every write to two independent storage&lt;br /&gt;
systems and requires a high bandwidth network (e.g. gigabit ethernet).&lt;br /&gt;
It reduces performance, but the caching optimizations listed above can&lt;br /&gt;
help.  The primary and slave systems are usually located in different&lt;br /&gt;
rooms (or buildings) and on different electrical circuits.&lt;br /&gt;
&lt;br /&gt;
Asynchronous mirroring periodically updates a second storage system&lt;br /&gt;
with changes from the master.  It makes a &amp;quot;snapshot&amp;quot; of&lt;br /&gt;
the system every few minutes and then transmits the difference between&lt;br /&gt;
the previous snapshot and the current one to the slave.  This uses&lt;br /&gt;
less bandwidth, but lags the main filesystem by several minutes.  It&lt;br /&gt;
can allow a backup filesystem to be located in another geographic&lt;br /&gt;
region.&lt;br /&gt;
&lt;br /&gt;
Another approach is to use subversion tools to maintain a mirror.&lt;br /&gt;
Set up svnsync to periodically sync a backup server off the main one.&lt;br /&gt;
This can lag behind by the polling interval, but it is simple to set up.&lt;br /&gt;
&lt;br /&gt;
You can eliminate the lag by setting up a post-commit script that runs&lt;br /&gt;
&amp;quot;svnadmin dump --incremental -r N&amp;quot; of that commit onto a separate&lt;br /&gt;
partition/server.  This creates a transaction log of commits that can&lt;br /&gt;
be replayed on a recent backup to restore full state.&lt;br /&gt;
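&lt;br /&gt;
A minimal sketch of such a post-commit helper is shown here.  The function&lt;br /&gt;
name dump_revision, the output directory, and the file naming scheme are&lt;br /&gt;
assumptions, and the status=none option assumes GNU dd.&lt;br /&gt;
&lt;br /&gt;
```shell
#!/bin/sh
# Write one incremental dump file per committed revision.
# $1 = repository path, $2 = revision number,
# $3 = output directory (assumed to be on a separate partition or server).
dump_revision() {
  repos=$1
  rev=$2
  outdir=$3
  # -q suppresses progress messages; dd writes the piped dump to a file.
  svnadmin dump -q --incremental -r "$rev" "$repos" |
    dd of="$outdir/r$rev.dump" status=none
}

# In hooks/post-commit, Subversion passes the repository path and the new
# revision number as the first two arguments, so the call could look like:
#   dump_revision "$1" "$2" /backup/svn-dumps
```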
&lt;br /&gt;
&lt;br /&gt;
(TODO: reference the benchmarking capabilities in mstone)&lt;/div&gt;</summary>
		<author><name>Dchristian</name></author>
	</entry>
	<entry>
		<id>https://www.orcaware.com/svn/mediawiki/index.php?title=Server_performance_tuning_for_Linux_and_Unix&amp;diff=1744</id>
		<title>Server performance tuning for Linux and Unix</title>
		<link rel="alternate" type="text/html" href="https://www.orcaware.com/svn/mediawiki/index.php?title=Server_performance_tuning_for_Linux_and_Unix&amp;diff=1744"/>
		<updated>2008-05-30T20:32:37Z</updated>

		<summary type="html">&lt;p&gt;Dchristian: Server performance tuning for Linux/Unix&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Server performance tuning for Linux/Unix ==&lt;br /&gt;
&lt;br /&gt;
Optimizing subversion server performance&lt;br /&gt;
Dan Christian&lt;br /&gt;
5 May 2008&lt;br /&gt;
&lt;br /&gt;
General notes:&lt;br /&gt;
&lt;br /&gt;
Most of these tips apply to both the Berkeley DB and the FSFS&lt;br /&gt;
repository formats.  FSFS is generally considered to be faster.&lt;br /&gt;
Database servers have the same kinds of problems and are a good source&lt;br /&gt;
for disk and operating system (OS) tuning advice.&lt;br /&gt;
http://dev.mysql.com/doc/refman/5.0/en/innodb-configuration.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Making reads cheaper:&lt;br /&gt;
&lt;br /&gt;
The Unix concept of &amp;quot;file access time&amp;quot; (commonly called atime) is a&lt;br /&gt;
performance problem.  When filesystem semantics were being defined, it&lt;br /&gt;
seemed like a good idea to know when a file was last accessed.  The&lt;br /&gt;
downside is that every file open call now causes a disk write.  A few&lt;br /&gt;
utilities use this information (e.g. tmpwatch and mail), but subversion&lt;br /&gt;
never uses atime.  Subversion performance is improved by avoiding the&lt;br /&gt;
access time writes.&lt;br /&gt;
&lt;br /&gt;
For a local filesystem, you can disable this behavior with mount&lt;br /&gt;
options.  On Linux, it&#039;s the &#039;noatime&#039; and &#039;nodiratime&#039; options.  On an&lt;br /&gt;
NFS filesystem, the atime recording happens on the server and must be&lt;br /&gt;
disabled in the server&#039;s configuration.&lt;br /&gt;
&lt;br /&gt;
A lazy atime approach called &amp;quot;relatime&amp;quot; was introduced in Linux-2.6.20&lt;br /&gt;
and mount-2.13.  This eliminates most atime writes without breaking&lt;br /&gt;
the few utilities that need it.  This is most useful if the repository&lt;br /&gt;
must be on the same partition as the mail spool and/or temporary&lt;br /&gt;
files.  See: http://kerneltrap.org/node/14148 and&lt;br /&gt;
http://kernelnewbies.org/Linux_2_6_20&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Making writes cheaper:&lt;br /&gt;
&lt;br /&gt;
Subversion uses the fsync() call (or the equivalent on non-Unix&lt;br /&gt;
operating systems) to tell the operating system to write data to disk.&lt;br /&gt;
Up until that point, the data is usually only in memory and the operating&lt;br /&gt;
system will write it to disk &amp;quot;when it gets around to it&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
By calling fsync() before finishing a commit, subversion is trying to&lt;br /&gt;
guarantee that everything it said had been done would be there when&lt;br /&gt;
the machine re-boots.  Waiting for data to write out to disk is often&lt;br /&gt;
the slowest part of a commit.&lt;br /&gt;
&lt;br /&gt;
However, the operating system doesn&#039;t always hold up its end of the&lt;br /&gt;
bargain.  On Linux, fsync() only ensures that the data is on its way&lt;br /&gt;
to the disk &amp;quot;as soon as possible&amp;quot;.  If write cache is enabled on the&lt;br /&gt;
drive, then it doesn&#039;t actually wait for the data to hit the disk&lt;br /&gt;
platter before returning.  This means there is a window of time that a&lt;br /&gt;
power loss can cause the disk state to not match what subversion&lt;br /&gt;
returned.&lt;br /&gt;
&lt;br /&gt;
One way to significantly increase fsync() performance is to use a RAID&lt;br /&gt;
controller with a battery backed write cache.  The cache is treated as&lt;br /&gt;
part of the disk system.  As soon as the data is in the cache, the&lt;br /&gt;
fsync() can safely return.  This means you don&#039;t have to wait for the&lt;br /&gt;
disk head seek or the data transfer.  If power is interrupted, the&lt;br /&gt;
RAID controller will finish writing out the cache when power is&lt;br /&gt;
restored.&lt;br /&gt;
&lt;br /&gt;
A newer way to avoid this problem is a flash based disk.  There is no&lt;br /&gt;
latency from head movement or waiting for the disk to rotate.  This&lt;br /&gt;
becomes more significant when writing many small files (like many FSFS&lt;br /&gt;
writes).  The current downsides of flash disks are high cost, limited&lt;br /&gt;
capacity, and low write bandwidth (but these problems are improving).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Reducing the number of writes:&lt;br /&gt;
&lt;br /&gt;
As of subversion-1.5, transactions can be built up on a different&lt;br /&gt;
filesystem than the one holding the repository.  This is valuable when&lt;br /&gt;
the repository lives on a slower filesystem like NFS.&lt;br /&gt;
&lt;br /&gt;
To implement this, do the following:&lt;br /&gt;
  # Stop all servers that can write to the repository.&lt;br /&gt;
  cd REPO_PATH/db&lt;br /&gt;
  mv transactions /LOCAL/DISK/PATH/&lt;br /&gt;
  ln -s /LOCAL/DISK/PATH/transactions .&lt;br /&gt;
  # Start the servers.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Reduce directory index size:&lt;br /&gt;
&lt;br /&gt;
The subversion-1.5 repository format allows the revisions to be stored&lt;br /&gt;
in subdirectories that don&#039;t grow past a specified size.  This allows&lt;br /&gt;
repositories to store many more revisions than can (efficiently) be&lt;br /&gt;
stored in one directory.&lt;br /&gt;
&lt;br /&gt;
Modern filesystems can handle hundreds of thousands of files in a&lt;br /&gt;
single directory.  However, performance can suffer as the directory&lt;br /&gt;
index starts to use multiple levels of indirection.  Some&lt;br /&gt;
administration tools may also have trouble with very large&lt;br /&gt;
directories.  Splitting the revision store into sub-directories avoids&lt;br /&gt;
all these problems.&lt;br /&gt;
&lt;br /&gt;
The shard size can be adjusted by editing the &amp;quot;layout sharded&amp;quot; line in&lt;br /&gt;
&amp;quot;db/format&amp;quot; after &#039;svnadmin create&#039; but before populating the&lt;br /&gt;
repository.  The default is 1000 revisions per subdirectory.&lt;br /&gt;
Non-sharded repositories can be loaded into a new, sharded,&lt;br /&gt;
repository using &amp;quot;svnadmin load&amp;quot; or &amp;quot;svnsync&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Make write-once portions of the repository skip NFS cache checks:&lt;br /&gt;
&lt;br /&gt;
If the repository is on an NFS filesystem, then a cache consistency&lt;br /&gt;
check is made every time a file is opened.  Since the revision files&lt;br /&gt;
in an FSFS repository never change, it is worthwhile to skip the cache&lt;br /&gt;
checks on these files.  The subversion-1.5 repository format stores&lt;br /&gt;
immutable files in specific subdirectories so that this can be done.&lt;br /&gt;
&lt;br /&gt;
The NFS cache check can be disabled on Linux by passing the &#039;nocto&#039;&lt;br /&gt;
option to the mount command (note: the man page claims this is&lt;br /&gt;
ignored, but it isn&#039;t on Linux-2.6).  You need coherency for some&lt;br /&gt;
files, so the NFS volume is also mounted without the option on a&lt;br /&gt;
different mount point.  Symbolic links are made from the cache&lt;br /&gt;
coherent mount point to the &#039;nocto&#039; mount for these directories: revs&lt;br /&gt;
and txn-protorevs.&lt;br /&gt;
&lt;br /&gt;
Implementation example (not complete, just an outline of the key steps):&lt;br /&gt;
  # Stop all servers that can write to the repository.&lt;br /&gt;
  sudo mount -t nfs nfs_server:/mount_point /mnt/svn -o \&lt;br /&gt;
    rw,nosuid,tcp,rsize=32768,wsize=32768&lt;br /&gt;
  sudo mount -t nfs nfs_server:/mount_point /mnt/svn-nocto -o \&lt;br /&gt;
    rw,nosuid,tcp,rsize=32768,wsize=32768,nocto,actimeo=3600&lt;br /&gt;
  # revs and txn-protorevs live in the db subdirectory of the repository.&lt;br /&gt;
  cd /mnt/svn/repo_path/db&lt;br /&gt;
  mv revs revs-nocto&lt;br /&gt;
  mv txn-protorevs txn-protorevs-nocto&lt;br /&gt;
  ln -s /mnt/svn-nocto/repo_path/db/revs-nocto revs&lt;br /&gt;
  ln -s /mnt/svn-nocto/repo_path/db/txn-protorevs-nocto txn-protorevs&lt;br /&gt;
  # Start the servers.&lt;br /&gt;
  &lt;br /&gt;
&lt;br /&gt;
Increase NFS caching timeout:&lt;br /&gt;
&lt;br /&gt;
On Linux, metadata on files from NFS is only kept for a finite period&lt;br /&gt;
of time.  This can be changed by passing the actimeo option to the&lt;br /&gt;
mount command.  The man page claims the default is 60 (seconds), but&lt;br /&gt;
some experimentation suggests it may be higher than that.  For a&lt;br /&gt;
&#039;nocto&#039; mount point, this value can be raised to something much larger&lt;br /&gt;
(e.g. 3600).  See the above example.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Distributing CPU load:&lt;br /&gt;
&lt;br /&gt;
The subversion server communicates with the clients by transmitting&lt;br /&gt;
differences in state, so the CPU load to calculate the difference can&lt;br /&gt;
be significant.  By storing the repository on NFS, you can have&lt;br /&gt;
multiple &amp;quot;front end&amp;quot; (FE) systems that share the computational load&lt;br /&gt;
and provide redundancy.  A network load balancer makes all front ends&lt;br /&gt;
(FEs) appear as one server to users.&lt;br /&gt;
&lt;br /&gt;
The FEs can either run svnserve or http-DAV.  If DAV is used, you need&lt;br /&gt;
to ensure that the load balancer keeps an entire transaction on the&lt;br /&gt;
same FE (to allow transactions to be built up on local disk).  The&lt;br /&gt;
load balancer must be configured with &amp;quot;machine affinity&amp;quot; set, so that&lt;br /&gt;
all HTTP connections from a client will be routed to the same server.&lt;br /&gt;
You should also configure apache to keep a single TCP connection for&lt;br /&gt;
the entire transaction (see example below).&lt;br /&gt;
&lt;br /&gt;
Apache configuration to maintain a TCP connection:&lt;br /&gt;
  # 1. Enable HTTP persistent connections so a single transaction can&lt;br /&gt;
  #    be built up over a single connection.&lt;br /&gt;
  KeepAlive             on&lt;br /&gt;
  # 2. Allow as many KeepAlives as required (0 =&amp;gt; infinite) to keep&lt;br /&gt;
  #    the same connection alive.&lt;br /&gt;
  MaxKeepAliveRequests  0&lt;br /&gt;
  # 3. Limit a child to serving only this 1 connection.&lt;br /&gt;
  MaxRequestsPerChild   1&lt;br /&gt;
&lt;br /&gt;
The last one is counter-intuitive, but see the &amp;quot;Note&amp;quot; at&lt;br /&gt;
http://httpd.apache.org/docs/2.2/mod/mpm_common.html#maxrequestsperchild.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
High storage system reliability:&lt;br /&gt;
&lt;br /&gt;
The purpose of a version control system is to store a sequence of&lt;br /&gt;
file/directory versions so you can retrieve them in the future.  None&lt;br /&gt;
of this matters if the storage system fails.&lt;br /&gt;
&lt;br /&gt;
The simplest step is to do periodic backups of the repository.  This&lt;br /&gt;
limits the loss to the changes that happened since the last backup.&lt;br /&gt;
If the repository is large and the commit rate is high, it may be&lt;br /&gt;
impossible to back up frequently enough to prevent significant data&lt;br /&gt;
loss.  For example, if your repository gets one commit per second and&lt;br /&gt;
you do a backup every hour, you may lose 3600 revisions if the disk&lt;br /&gt;
fails.  This is a large scale example, but the point is to gather&lt;br /&gt;
your own numbers and figure out how much you might lose.&lt;br /&gt;
&lt;br /&gt;
The next step is to make the disk system redundant using RAID&lt;br /&gt;
technology.  This allows one (and sometimes more) disks to fail&lt;br /&gt;
without losing data.  This still won&#039;t help if additional disks fail&lt;br /&gt;
during recovery or the entire array is lost due to fire, theft, etc.&lt;br /&gt;
&lt;br /&gt;
Advanced NFS servers can be configured to do synchronous mirroring&lt;br /&gt;
and/or asynchronous mirroring (also known as snapshot replication).&lt;br /&gt;
These capabilities are available in some commercial servers, or you&lt;br /&gt;
can find various free alternatives by searching for &amp;quot;NFS server high&lt;br /&gt;
availability&amp;quot; or &amp;quot;NFS server snapshot replication&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Synchronous mirroring sends every write to two independent storage&lt;br /&gt;
systems and requires a high bandwidth network (e.g. gigabit ethernet).&lt;br /&gt;
It reduces performance, but the caching optimizations listed above can&lt;br /&gt;
help.  The primary and slave systems are usually located in different&lt;br /&gt;
rooms (or buildings) and on different electrical circuits.&lt;br /&gt;
&lt;br /&gt;
Asynchronous mirroring periodically updates a second storage system&lt;br /&gt;
with changes from the master.  It makes a &amp;quot;snapshot&amp;quot; of&lt;br /&gt;
the system every few minutes and then transmits the difference between&lt;br /&gt;
the previous snapshot and the current one to the slave.  This uses&lt;br /&gt;
less bandwidth, but lags the main filesystem by several minutes.  It&lt;br /&gt;
can allow a backup filesystem to be located in another geographic&lt;br /&gt;
region.&lt;br /&gt;
&lt;br /&gt;
Another approach is to use subversion tools to maintain a mirror.&lt;br /&gt;
Set up svnsync to periodically sync a backup server off the main one.&lt;br /&gt;
This can lag behind by the polling interval, but it is simple to set up.&lt;br /&gt;
&lt;br /&gt;
You can eliminate the lag by setting up a post-commit script that runs&lt;br /&gt;
&amp;quot;svnadmin dump --incremental -r N&amp;quot; of that commit onto a separate&lt;br /&gt;
partition/server.  This creates a transaction log of commits that can&lt;br /&gt;
be replayed on a recent backup to restore full state.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
(TODO: reference the benchmarking capabilities in mstone)&lt;/div&gt;</summary>
		<author><name>Dchristian</name></author>
	</entry>
</feed>