How to Conduct a Software Audit

Fundamentals of Software Audit Data Collection

One of the tasks with which Scott & Scott’s clients most commonly request our assistance is conducting a software audit. While almost all companies, large and small, use software to conduct their business operations, many of those companies lack mature or even nascent software asset management practices. Consequently, when asked to cooperate with an audit of their software usage, they may not know where to begin and may consider themselves to be at the mercy of the auditors when it comes to information requests.

This state of affairs can be financially disastrous. Companies that cannot confirm what software they are using cannot effectively manage their portfolio of software licenses to support that usage. This almost always results in over-deployment and over-usage of software, which leads to exposure during audits. In addition, blind cooperation with all information requests received from software auditors can result in companies sharing more information than necessary to confirm their licensing obligations, which in turn can result in inflated compliance demands.

In order to effectively manage their software usage and to mitigate compliance exposure, companies need to know how to gather and analyze information regarding their product usage. While some software products may have unique data-collection requirements that ordinarily would not be applicable for other products, usage levels for many products can be measured using a common set of reports that companies can learn to gather themselves. The purpose of this and other posts in this series is to give companies insight regarding how to gather those datasets and why that information is relevant to their licensing obligations.

At a high level, there are five basic datasets that companies generally need to be prepared to collect in order to confirm their license positions for different software products. Future posts in this series will discuss each of those datasets and how to gather them in greater detail, but it is helpful at the outset to have an overview of what to expect.

Hardware Inventory – The backbone of deployment data for many products is a list of all devices in an IT environment capable of running or interacting with a licensed software product. This inventory should include all physical and virtual workstations and servers. In many cases, it also should include devices like thin clients that may not be capable of running installed copies of software, but that can be used to access software deployed in server environments. The list also should include information regarding the make and model of each device as well as the make, model and quantity of processors and processor cores for physical machines (virtual machines are addressed below). Also, in order to validate the completeness of the principal hardware inventory, it almost always is a good idea to obtain a secondary inventory from a source like Active Directory or an antivirus solution.
Virtualization Inventory – Most companies use software in virtualized computing environments where one or more virtual or “logical” servers are configured to run on one or more physical “host” machines. For these environments, it is necessary to gather information that can be used to map the virtual machines to their respective physical hosts. That information also should include details regarding the number of virtual processors allocated to each virtual machine and an indication of whether the virtual machines are capable of moving automatically from one physical host to another.
Software-Deployment Inventory – Once a company is capable of producing a list of the physical and virtual computing devices in its environment, it is necessary to gather information regarding what software products are installed and running on those devices. In many cases this software inventory can be gathered using the same tool that produces the hardware inventory. The information in this inventory should include the software publisher’s name, product name, version and edition (for example, Standard, Professional, Enterprise).
User Data – Many server products are licensed in whole or in part on the basis of the number of remote users or devices that access those products. When those products are deployed in a company’s IT environment, it therefore is necessary to gather lists of those remote users or devices. In Windows-based environments, this information most commonly is obtained from Active Directory, which is a component of the Windows Server operating system. However, AD reporting may not always be a good source of information where server applications use “pooled” connections, in which multiple remote users access a server through a single user account, often associated with a particular application. In those situations, it may be necessary to pull user lists from those applications or from other sources in order to build a complete user list.
Entitlement Data – All of the preceding data sources are used to determine how much product usage exists in an environment. Once usage is confirmed, it is necessary to compare those deployment counts against the kinds and quantities of software licenses held by a company. Some publishers, like Microsoft, provide relatively convenient options for obtaining comprehensive lists of all licensed products ordered under volume-licensing programs. However, in other cases it may be necessary to work with a company’s vendors in order to obtain documentation demonstrating what licenses previously have been purchased.
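
To make the relationship among these datasets more concrete, the following Python sketch shows one way the pieces might fit together once the reports have been exported. It is an illustration only. The record layouts, field names (such as "host", "vcpus" and "can_migrate"), device names and product names are all hypothetical, and the license comparison assumes a simple one-license-per-installation metric, which many products do not use (per-core and per-user metrics require different arithmetic).

```python
from collections import Counter, defaultdict


def inventory_gaps(primary, secondary):
    """Compare the principal hardware inventory with a secondary source
    (for example, an Active Directory or antivirus export).
    Device names are compared case-insensitively."""
    p = {name.strip().lower() for name in primary}
    s = {name.strip().lower() for name in secondary}
    return {
        "missing_from_primary": sorted(s - p),
        "missing_from_secondary": sorted(p - s),
    }


def summarize_hosts(vm_records):
    """Map virtual machines to their physical hosts, total the virtual
    processors allocated on each host, and note whether any VM on the
    host can move automatically to another host."""
    hosts = defaultdict(lambda: {"vms": [], "vcpus": 0, "any_mobile": False})
    for vm in vm_records:
        entry = hosts[vm["host"]]
        entry["vms"].append(vm["name"])
        entry["vcpus"] += vm["vcpus"]
        entry["any_mobile"] = entry["any_mobile"] or vm["can_migrate"]
    return dict(hosts)


def license_position(installs, entitlements):
    """Return licenses owned minus installations found, per
    (product, edition). A negative number indicates a shortfall.
    ASSUMPTION: one license covers exactly one installation."""
    deployed = Counter((i["product"], i["edition"]) for i in installs)
    keys = sorted(set(deployed) | set(entitlements))
    return {k: entitlements.get(k, 0) - deployed.get(k, 0) for k in keys}


# Hypothetical sample data, for illustration only.
primary = ["WS-001", "WS-002", "SRV-01"]
secondary = ["ws-001", "ws-002", "srv-01", "srv-02"]

vm_records = [
    {"name": "vm-a", "host": "host-1", "vcpus": 4, "can_migrate": True},
    {"name": "vm-b", "host": "host-1", "vcpus": 2, "can_migrate": False},
    {"name": "vm-c", "host": "host-2", "vcpus": 8, "can_migrate": False},
]

installs = [
    {"device": "WS-001", "product": "ExampleOffice", "edition": "Standard"},
    {"device": "WS-002", "product": "ExampleOffice", "edition": "Standard"},
    {"device": "SRV-01", "product": "ExampleDB", "edition": "Enterprise"},
]
entitlements = {
    ("ExampleOffice", "Standard"): 1,
    ("ExampleDB", "Enterprise"): 2,
}

print(inventory_gaps(primary, secondary))
print(summarize_hosts(vm_records))
print(license_position(installs, entitlements))
```

In this sample, the secondary inventory reveals a server that the principal inventory missed, and the hypothetical ExampleOffice product shows a shortfall of one license. A device that appears in only one inventory is a prompt for further investigation rather than a conclusion, because stale records in sources like Active Directory are common.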

Again, it is important to keep in mind that certain software products may have product-specific licensing requirements that may not be measurable using the above data sources. For example, Microsoft Exchange Server client access licenses (CALs) are available in Standard and Enterprise editions, with the Enterprise CALs being required when certain Enterprise functionalities are enabled for user accounts. In order to determine which accounts require Enterprise CALs, it is necessary to obtain information regarding account configurations from the Exchange server. As another example, Oracle Database includes many added-cost options and packs that require additive licensing. In order to confirm a license position for Oracle Database, it therefore is necessary to run queries against each database server in order to obtain data regarding feature usage.
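
As a rough illustration of the Exchange example, the Python sketch below assumes that account configurations already have been exported from the Exchange server into simple records. The record layout and the feature names are placeholders, not actual Exchange settings. The set of features that triggers an Enterprise CAL requirement must come from the publisher’s current licensing terms.

```python
def accounts_needing_enterprise_cal(accounts, enterprise_features):
    """Return (user, enabled_enterprise_features) for every account that
    has at least one feature from `enterprise_features` switched on.
    `enterprise_features` is supplied by the reader from the licensing
    terms; nothing here encodes actual Exchange licensing rules."""
    flagged = []
    for account in accounts:
        enabled = {name for name, on in account["features"].items() if on}
        hits = sorted(enabled & enterprise_features)
        if hits:
            flagged.append((account["user"], hits))
    return flagged


# Hypothetical export and placeholder feature names.
accounts = [
    {"user": "alice", "features": {"premium_feature_1": True, "basic_mail": True}},
    {"user": "bob", "features": {"premium_feature_1": False, "basic_mail": True}},
    {"user": "carol", "features": {"premium_feature_2": True}},
]
enterprise_features = {"premium_feature_1", "premium_feature_2"}

for user, features in accounts_needing_enterprise_cal(accounts, enterprise_features):
    print(user, features)
```

The same pattern of exporting product-specific configuration data and filtering it against the features that carry a licensing cost applies in principle to the Oracle Database example, although the data there comes from queries run against each database server.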

However, except for those product-specific situations, the above five data sources often are sufficient to enable a company to complete most of the work toward validating its license positions for many different products.