Subversion is a free/open source version control system (VCS). That is, Subversion manages files and directories, and the changes made to them, over time. This allows you to recover older versions of your data or examine the history of how your data changed. In this regard, many people think of a version control system as a sort of “time machine.”
Subversion can operate across networks, which allows it to be used by people on different computers. At some level, the ability for various people to modify and manage the same set of data from their respective locations fosters collaboration. Progress can occur more quickly without a single conduit through which all modifications must occur. And because the work is versioned, you need not fear that quality is the trade-off for losing that conduit—if some incorrect change is made to the data, just undo that change.
Some version control systems are also software configuration management (SCM) systems. These systems are specifically tailored to manage trees of source code and have many features that are specific to software development—such as natively understanding programming languages, or supplying tools for building software. Subversion, however, is not one of these systems. It is a general system that can be used to manage any collection of files. For you, those files might be source code—for others, anything from grocery shopping lists to digital video mixdowns and beyond.
Subversion is developed as a project of the Apache Software Foundation, and as such is part of a rich community of developers and users. We’re always in need of individuals with a wide range of skills, and we invite you to participate in the development of Apache Subversion.
Subversion exists to be universally recognized and adopted as an open-source, centralized version control system characterized by its reliability as a safe haven for valuable data; the simplicity of its model and usage; and its ability to support the needs of a wide variety of users and projects, from individuals to large-scale enterprise operations.
Is Subversion the Right Tool?
If you’re a user or system administrator pondering the use of Subversion, the first question you should ask yourself is: “Is this the right tool for the job?” Subversion is a fantastic hammer, but be careful not to view every problem as a nail.
As a first step, you need to decide if version control in general is required for your purposes. If you need to archive old versions of files and directories, possibly resurrect them, and examine logs of how they’ve changed over time, then version control tools can do that. If you need to collaborate with people on documents (usually over a network) and keep track of who made which changes, a version control tool can do that, too. In fact, this is why version control tools such as Subversion are so often used in software development environments—working on a development team is an inherently social activity where changes to source code files are constantly being discussed, made, evaluated, and even sometimes unmade. Version control tools facilitate that sort of collaboration.
There is a cost associated with using version control, too. Unless you can outsource the administration of your version control system to a third party, you’ll have the obvious costs of performing that administration yourself. When working with the data on a daily basis, you won’t be able to copy, move, rename, or delete files the way you usually do. Instead, you’ll have to do all of those things through the version control system.
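As a minimal sketch of what doing these things "through the version control system" looks like, the Python helper below builds the `svn` command lines that stand in for ordinary copy, move, delete, and mkdir operations inside a working copy. The helper names `svn_command` and `run`, and the file paths, are hypothetical; `copy`, `move`, `delete`, and `mkdir` are real `svn` subcommands. Nothing is executed unless you call `run` on a machine that has the Subversion client installed.

```python
import subprocess

# Filesystem operations that have a direct svn subcommand equivalent.
SVN_FILE_OPERATIONS = {"copy", "move", "delete", "mkdir"}


def svn_command(operation, *paths):
    """Return the argv list for a versioned file operation (hypothetical helper)."""
    if operation not in SVN_FILE_OPERATIONS:
        raise ValueError(f"unsupported operation: {operation!r}")
    if not paths:
        raise ValueError("at least one path is required")
    return ["svn", operation, *paths]


def run(argv):
    """Execute the command; needs the svn client on PATH and a working copy."""
    return subprocess.run(argv, check=True)


# Instead of `mv notes.txt docs/notes.txt`, schedule a versioned rename.
rename = svn_command("move", "notes.txt", "docs/notes.txt")
print(" ".join(rename))  # svn move notes.txt docs/notes.txt
```

A change made this way is only scheduled in the working copy; it reaches the repository with the next `svn commit`.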
Even assuming that you are okay with the cost/benefit tradeoff afforded by a version control system, you shouldn’t choose to use one merely because it can do what you want. Consider whether your needs are better addressed by other tools. For example, because Subversion replicates data to all the collaborators involved, a common misuse is to treat it as a generic distribution system. People will sometimes use Subversion to distribute huge collections of photos, digital music, or software packages. The problem is that this sort of data usually isn’t changing at all. The collection itself grows over time, but the individual files within the collection aren’t being changed. In this case, using Subversion is “overkill.” There are simpler tools that efficiently replicate data without the overhead of tracking changes, such as rsync or unison.
Once you’ve decided that you need a version control solution, you’ll find no shortage of available options. When Subversion was first designed and released, the predominant methodology of version control was centralized version control—a single remote master storehouse of versioned data with individual users operating locally against shallow copies of that data’s version history. Subversion quickly emerged after its initial introduction as the clear leader in this field of version control, earning widespread adoption and supplanting installations of many older version control systems. It continues to hold that prominent position today.
Much has changed since that time, though. In the years since the Subversion project began its life, a newer methodology of version control called distributed version control has likewise garnered widespread attention and adoption. Tools such as Git (http://git-scm.com/) and Mercurial (http://mercurial.selenic.com/) quickly rose to the top of the distributed version control system (DVCS) ranks. Distributed version control harnesses the growing ubiquity of high-speed network connections and low storage costs to offer an approach that differs from the centralized model in key ways. First and most obvious is the fact that there is no remote, central storehouse of versioned data. Rather, each user keeps and operates against a very deep (complete, in a sense) local version history data store. Collaboration still occurs, but is accomplished by trading changesets (collections of changes made to versioned items) directly between users’ local data stores, not via a centralized master data store. In fact, any semblance of a canonical “master” source of a project’s versioned data is by convention only, a status attributed by the various collaborators on that project.
There are pros and cons to each version control approach. Perhaps the two biggest benefits delivered by the DVCS tools are incredible performance for day-to-day operations (because the primary data store is locally held) and vastly better support for merging between branches (because merge algorithms serve as the very core of how DVCSes work at all). The downside is that distributed version control is an inherently more complicated model, which can present a non-negligible challenge to comfortable collaboration. Also, DVCS tools do what they do well in part because of a certain degree of control withheld from the user that centralized systems freely offer: the ability to implement path-based access control, the flexibility to update or backdate individual versioned data items, etc. Fortunately, many wise organizations have discovered that this needn’t be a religious debate, and that Subversion and a DVCS tool such as Git can be used together harmoniously within the organization, each serving the purposes best suited to the tool.
On one end is a Subversion repository that holds all of your versioned data. On the other end is your Subversion client program, which manages local reflections of portions of that versioned data. Between these extremes are multiple routes through a Repository Access (RA) layer, some of which go across computer networks and through network servers which then access the repository, others of which bypass the network altogether and access the repository directly.
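The routes between client and repository can be illustrated by the URL scheme used to address the repository. The Python sketch below maps each scheme to the path it takes. The scheme names (`file`, `http`, `https`, `svn`, `svn+ssh`) come from standard Subversion usage rather than from the text above, and the function name `access_route` and the example host are hypothetical.

```python
from urllib.parse import urlsplit

# Repository URL scheme -> how the client reaches the repository.
ACCESS_ROUTES = {
    "file": "direct access to a repository on local disk (no network)",
    "http": "Apache HTTP Server with mod_dav_svn",
    "https": "Apache HTTP Server with mod_dav_svn, over TLS",
    "svn": "svnserve over its custom network protocol",
    "svn+ssh": "svnserve, tunneled through SSH",
}


def access_route(url):
    """Describe the route a repository URL implies (illustrative only)."""
    scheme = urlsplit(url).scheme.lower()
    try:
        return ACCESS_ROUTES[scheme]
    except KeyError:
        raise ValueError(f"not a recognized repository URL scheme: {scheme!r}") from None


# Example with a hypothetical server name.
print(access_route("https://svn.example.com/repos/project/trunk"))
```

The lookup table mirrors the idea that the client code is the same in every case; only the access route chosen by the URL differs.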
Subversion, once installed, has a number of different pieces. The following is a quick overview of what you get. Don’t be alarmed if the brief descriptions leave you scratching your head—plenty more pages in this book are devoted to alleviating that confusion.
svn: The command-line client program
svnversion: A program for reporting the state (in terms of revisions of the items present) of a working copy
svnlook: A tool for directly inspecting a Subversion repository
svnadmin: A tool for creating, tweaking, or repairing a Subversion repository
mod_dav_svn: A plug-in module for the Apache HTTP Server, used to make your repository available to others over a network
svnserve: A custom standalone server program, runnable as a daemon process or invokable by SSH; another way to make your repository available to others over a network
svndumpfilter: A program for filtering Subversion repository dump streams
svnsync: A program for incrementally mirroring one repository to another over a network
svnrdump: A program for performing repository history dumps and loads over a network
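To see which of these pieces are present on a given machine, a short Python check along the following lines may help. It is a sketch, not part of Subversion: the function name `find_subversion_programs` is hypothetical, and the executable names listed are the standard ones shipped with Subversion. The Apache plug-in module (`mod_dav_svn`) is loaded by the web server rather than run from the command line, so it is left out.

```python
import shutil

# Standard Subversion executables; the Apache module is not a PATH program.
SUBVERSION_PROGRAMS = (
    "svn",
    "svnversion",
    "svnlook",
    "svnadmin",
    "svnserve",
    "svndumpfilter",
    "svnsync",
    "svnrdump",
)


def find_subversion_programs(programs=SUBVERSION_PROGRAMS):
    """Map each program name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in programs}


for name, path in find_subversion_programs().items():
    print(f"{name:14} {path or 'not found on PATH'}")
```

Some packagers split the client and server tools into separate packages, so a partial result does not necessarily indicate a broken installation.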