Data Governance for Privacy, Confidentiality and Compliance: A Holistic Approach

Date Published: 1 November 2010

The digital era has created unprecedented opportunities to conduct business and deliver services over the Internet. Nevertheless, as organizations collect, store, process and exchange large volumes of information in the course of addressing these opportunities, they face increasing challenges in the areas of data security, maintaining data privacy and meeting related compliance obligations.

Forward-looking organizations are recognizing the need for a holistic approach to meet these challenges. In this context, “holistic” means an approach that enables the organization to address the following three objectives in a unified, cross-disciplinary way, rather than as three separate problems to be addressed by different groups within the organization:

  • Traditional IT security approaches that focus on protecting the organization’s IT infrastructure by securing the network edge and end points need to be augmented with protective measures that focus specifically on protecting the data that are stored and moved through that infrastructure.
  • Privacy-related protective measures must extend beyond those aspects of privacy that overlap with security, to include protective measures that focus on capturing, preserving and enforcing the choices customers have made with respect to how and when their personal information may be collected, processed, used and potentially shared with third parties.
  • Data security and data privacy compliance obligations need to be rationalized and addressed through a unified set of control objectives and control activities that meet the requirements.

Such an approach requires cooperation among the IT, human resources, legal and finance departments as well as business groups and the marketing department—in short, any group with a stake in collecting, processing, using and managing personally identifiable information (PII), intellectual property, trade secrets and other types of confidential information.

It is important to point out that the proposed approach to data governance for security, privacy, confidentiality and compliance does not call for modifying or replacing the organization’s existing information security management systems or IT governance processes. Rather, it augments them by specifying additional roles, tasks and technical tools that can help organizations better protect data privacy and security and satisfy compliance obligations.

This article presents an overview of the Data Governance for Privacy, Confidentiality and Compliance (DGPC) framework developed by Microsoft to assist organizations in creating a data governance program that addresses all three objectives in a holistic manner.1 In particular, this discussion focuses on the risk management portion of the DGPC framework.

Business Case for Data Governance

IT professionals may ask why they would want to employ yet another framework if they already have a successful IT governance process, a well-established control framework and an effective information security management system to meet their security needs and compliance obligations. There are two reasons for this:

  • Security standards and control frameworks tend to focus primarily on protecting the overall IT infrastructure and on aligning investments in that infrastructure with the organization’s business goals. In other words, they provide a view of the data security “forest.” The DGPC framework complements these elements in crucial ways by focusing on the “trees” of data security—on identifying and managing security and privacy risks related to specific flows of data that need to be protected, including personal information, intellectual property, trade secrets and market data. Such focus is necessary to identify additional, data-flow-specific protective measures and controls that need to be implemented to cover gaps—that is, residual risks that are specific to the data flow and that are not addressed by broader protective measures.
  • The DGPC framework creates a context that enables identification of threats against privacy, including privacy threats that do not overlap with security threats, such as violations of customer choice and consent with respect to what types of personal information are collected and how they are used, processed and shared.

The DGPC framework works in concert with the organization’s existing IT management and control frameworks, such as COBIT, and with security standards such as ISO/IEC 27001/27002 and the Payment Card Industry Data Security Standard (PCI DSS). To the author’s knowledge, no other existing industry framework provides this combination of benefits and integration.

DGPC Framework Components

The DGPC framework is organized around three core capability areas: people, process and technology. This section briefly summarizes the first two areas and offers a more detailed look at the technology-related considerations that are integral to threat identification and risk management.

People
Data governance processes and tools are only as effective as the people who use and manage them. An important first step is to establish a DGPC team that consists of individuals from within the organization and give them clearly defined roles and responsibilities, adequate resources to perform their required duties, and clear guidance on the overall data governance objectives. Essentially, this is a virtual organization whose members are collectively responsible for defining principles, policies and procedures that govern key aspects of data classification, protection, use and management. These individuals—commonly known as “data stewards”—also typically develop the organization’s access control profiles, determine what constitutes a policy-compliant use of data, establish data breach notification procedures and escalation paths, and oversee other related data management areas.

Process
With the right people involved in the DGPC effort, the organization can focus on defining the processes involved. This begins with examining various authority documents (statutes, regulations, standards, and company policies and strategy documents) that spell out requirements that must be met. Understanding how these legal mandates, organizational policies and strategic objectives intersect will help the organization consolidate its business and compliance data requirements (including data quality metrics and business rules) into a harmonized set.

The next step is to define guiding principles and policies that generate the appropriate context in which to meet these requirements. Last, the organization should identify threats against data security, privacy and compliance in the context of specific data flows; analyze the related risks; and determine appropriate control objectives and control activities.

Technology
Microsoft has developed an approach to analyze specific data flows and identify residual, flow-specific risks that may not be addressed by the broader protective measures of the information security management system and/or the control framework. This approach involves filling out a form called the Risk/Gap Analysis Matrix, which is built around three elements: the information life cycle, four technology domains, and the organization’s data privacy and confidentiality principles. The following sections explain these concepts and discuss how they come together in the Risk/Gap Analysis Matrix.

Information Life Cycle
To identify residual risks and select appropriate technical measures and activities to protect confidential data, an organization must first understand how information flows throughout its systems over time and how the information is accessed and processed at different stages—by multiple applications and people and for various purposes. Understanding the risks within each life-cycle stage helps clarify what safeguards are needed to mitigate those risks.

Most IT professionals are well acquainted with these life-cycle stages, so discussing them in detail here is not necessary, except for one important facet: the need to include a transfer stage.

As data are copied or removed from storage as part of a transfer, a new information life cycle begins. Organizations should place as much emphasis on security and privacy for data that are being transferred as they do for the original data set. This requires understanding transfer vehicles—such as private networks, the Internet and storage media sent by courier—and their inherent risks. For example, media sent by courier or postal mail can be lost or stolen, so measures such as encryption should be taken to protect the data on those media. Data security also requires understanding how the recipient organization’s policies, systems and practices differ from those of the current keepers of the data. If the recipient does not have the same security capabilities and processes as the current keepers of the data, protections may need to be placed on the data or the process before transfer.
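The comparison between the current keepers of the data and the recipient can be sketched as a simple set difference. This is only an illustration: the `Party` class, the `missing_protections` helper and the capability labels are hypothetical names chosen for the example, not part of the DGPC framework.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Party:
    """An organization that keeps or receives the data (hypothetical model)."""
    name: str
    capabilities: frozenset[str]


def missing_protections(keeper: Party, recipient: Party) -> set[str]:
    """Return capabilities the keeper has that the recipient lacks."""
    return set(keeper.capabilities - recipient.capabilities)


# Capability labels below are illustrative assumptions.
keeper = Party(
    "current keeper",
    frozenset({"encryption_at_rest", "access_logging", "breach_response_plan"}),
)
recipient = Party("recipient", frozenset({"access_logging"}))

gaps = missing_protections(keeper, recipient)
for gap in sorted(gaps):
    # Each gap suggests a protection to place on the data or process first.
    print(f"Address before transfer: {gap}")
```

An empty result would suggest that, for the capabilities listed, the recipient matches the current keepers; it says nothing about risks inherent in the transfer vehicle itself.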

Other transfer challenges can arise when individuals and departments run reports or extract subsets of data from centralized databases for processing—particularly if they are using desktop data-mining and analysis tools that generate reports and data sets in the form of document files and spreadsheets. These files can also be easily transferred as e-mail attachments or saved to laptops, handheld smart devices or USB drives. Given that more than 60 percent of US data breaches in 2009 were attributed to lost or stolen laptops or media, organizations should closely monitor and protect such data transfers.2

Technology Domains
Organizations also need to systematically evaluate whether the technologies that protect their data confidentiality, integrity and availability are sufficient to reduce risk to acceptable levels. The following technology domains provide a frame of reference for this task:

  • Secure infrastructure—Safeguarding confidential information requires a technology infrastructure that can protect computers, storage devices, operating systems, applications and the network against malicious software, hacker intrusions and rogue insiders.
  • Identity and access control—Identity and access management (IAM or IdM) technologies help protect personal information from unauthorized access while facilitating its availability to legitimate users. These technologies include authentication mechanisms, data and resource access controls, provisioning systems, and user account management. From a compliance perspective, IAM capabilities enable an organization to accurately track and enforce user permissions across the enterprise.
  • Information protection—Confidential data require persistent protection because they are shared within and across organizations. Organizations must ensure that their databases, document management systems, file servers and practices correctly classify and safeguard confidential data throughout the life cycle.
  • Auditing and reporting—Technologies for systems management, monitoring and automation of compliance controls are useful for verifying that system and data access controls are operating effectively, and they are useful for identifying suspicious or noncompliant activity.

Data Privacy and Confidentiality Principles
The following four principles are meant to help organizations select technologies and activities that will protect their confidential data assets. They are high-level statements that can be followed by more detailed guidance—clear, concise statements or questions that inform the risk management and decision-making process.

  • Principle 1: Honor policies throughout the confidential data life span.3 This includes a commitment to process all data in accordance with applicable statutes and regulations, preserve privacy and respect customer choice and consent, and allow individuals to review and correct their information if necessary.
  • Principle 2: Minimize risk of unauthorized access or misuse of confidential data. The information management system should provide reasonable administrative, technical and physical safeguards to ensure confidentiality, integrity and availability of data.
  • Principle 3: Minimize the impact of confidential data loss. Information protection systems should provide reasonable safeguards, such as encryption, to ensure the confidentiality of data that are lost or stolen. Appropriate data breach response plans and escalation paths should be in place, and all employees who are likely to be involved in breach response should receive training.
  • Principle 4: Document applicable controls and demonstrate their effectiveness. To help ensure accountability, the organization’s adherence to data privacy and confidentiality principles should be verified through appropriate monitoring, auditing and use of controls. Also, the organization should have a process for reporting noncompliance and a clearly defined escalation path.

The Risk/Gap Analysis Matrix
The Risk/Gap Analysis Matrix, shown in figure 2, brings together the information life cycle, technology domains, and data privacy and confidentiality principles in a tool that helps organizations identify and address gaps in their existing efforts to protect data against privacy, confidentiality and compliance threats within a specific data flow. The matrix provides a unified view of the flow’s existing and proposed protection technologies, measures and activities.

Each row depicts a stage in the information life cycle. Each of the first four columns represents a technology domain, while the far-right column represents manual control activities that must take place to meet the requirements of the four data privacy and confidentiality principles at each stage of the information life cycle. The four principles form the basis of questions that will be asked for every cell of the matrix.
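As a rough sketch, the matrix can be modeled as a mapping from (stage, column) pairs to cells that hold existing and proposed measures. The stage names in `LIFECYCLE_STAGES` are an assumption made for illustration (this article names only the transfer stage explicitly), and `Cell`, `empty_matrix` and `questions_for` are hypothetical helpers rather than part of any Microsoft tool.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Assumed stage names for illustration; only "transfer" is named in the text.
LIFECYCLE_STAGES = ["collect", "store", "process", "transfer", "delete"]

# Four technology domains plus the far-right manual controls column.
COLUMNS = [
    "secure infrastructure",
    "identity and access control",
    "information protection",
    "auditing and reporting",
    "manual controls",
]

PRINCIPLES = {
    1: "Honor policies throughout the confidential data life span",
    2: "Minimize risk of unauthorized access or misuse of confidential data",
    3: "Minimize the impact of confidential data loss",
    4: "Document applicable controls and demonstrate their effectiveness",
}


@dataclass
class Cell:
    """One cell of the matrix: measures in place and measures proposed."""
    existing: list[str] = field(default_factory=list)
    proposed: list[str] = field(default_factory=list)


def empty_matrix() -> dict[tuple[str, str], Cell]:
    """Build one empty cell per (life-cycle stage, column) pair."""
    return {(s, c): Cell() for s in LIFECYCLE_STAGES for c in COLUMNS}


def questions_for(stage: str, column: str) -> list[str]:
    """Generate one question per principle for a given cell."""
    return [
        f"[{stage} / {column}] Principle {n}: what supports '{text}'?"
        for n, text in PRINCIPLES.items()
    ]


matrix = empty_matrix()
matrix[("transfer", "information protection")].existing.append(
    "encrypt media sent by courier"
)
for question in questions_for("transfer", "information protection"):
    print(question)
```

Keeping existing and proposed measures side by side in each cell mirrors the unified view the matrix is meant to provide for a single data flow.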

Assessing Risks With the Risk/Gap Analysis Matrix
The matrix gives organizations a powerful tool for risk assessment and mitigation. The following analysis steps can help organizations identify gaps in existing protective measures and select corrective actions:

  • Step 1: Establish the risk analysis context—This involves defining the business purpose of the data flow; understanding how the data will be used and what systems are involved (defining the use cases); and identifying the privacy, security and compliance objectives for the flow.
  • Step 2: Perform threat modeling—Most threat-modeling techniques focus on security threats only, so they must be modified to detect nonsecurity-related threats involving privacy and noncompliance. Threat modeling involves two phases:
    – Diagramming involves creating a graphical representation of the data flow. Multiple techniques can be used for diagramming. Microsoft’s product teams and consulting services organization typically use data flow diagrams (DFDs) with the addition of “trust boundaries.” A trust boundary is a border that separates business entities and/or IT infrastructure realms, such as networks or administrative domains. In an example scenario, a customer provides PII to the application servers, which store it in servers administered by a cloud provider. Every transaction is logged in a log server that is administered by the same entity that administers the application servers. Every time confidential data cross a trust boundary, basic assumptions about security, policies, processes or practices (or all of these combined) may change, and, with them, the threats that will be identified in step 3. Note that in the diagramming step, the modeled entities typically represent systems and data stores rather than the individual processes depicted in “traditional” application security threat modeling. A detailed description of DFDs and trust boundaries can be found in the “Microsoft IT Infrastructure Threat Modeling Guide.”4
    – Threat enumeration is a systematic analysis of the diagram. In this context, a threat is not limited to attackers or technical threats, but can refer to anything that may violate any of the four data privacy and confidentiality principles. Organizations should use these principles to define categories of threats; the threats identified within those categories are the output of applying threat enumeration to the data flow. The exact definition of the categories will depend on the organization’s unique policies and the applicable industry, geography and legal compliance framework.
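The scenario described in the diagramming phase can be sketched in code to show where confidential data cross a trust boundary. This is a minimal illustration, assuming that a trust boundary is crossed whenever the administering entity changes; the `Node` and `Flow` classes and the realm labels are hypothetical, and the assumption that log entries carry confidential data is made only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A system or data store in the data flow diagram."""
    name: str
    realm: str  # entity that administers the node (assumed label)


@dataclass(frozen=True)
class Flow:
    """A directed movement of data between two nodes."""
    source: Node
    target: Node
    carries_confidential_data: bool = True  # assumption for this example


customer = Node("customer", "customer")
app = Node("application servers", "application operator")
storage = Node("storage servers", "cloud provider")
log = Node("log server", "application operator")

flows = [Flow(customer, app), Flow(app, storage), Flow(app, log)]


def crosses_trust_boundary(flow: Flow) -> bool:
    """A boundary is crossed when the administering entity changes."""
    return flow.source.realm != flow.target.realm


crossings = [
    f for f in flows if f.carries_confidential_data and crosses_trust_boundary(f)
]
for f in crossings:
    print(f"Review assumptions: {f.source.name} -> {f.target.name}")
```

In this sketch the flow to the log server is not flagged, because the log server shares an administering entity with the application servers; the other two flows are the ones where assumptions about security, policies and practices may change.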

The following example shows what the definition of the threat categories described in step 2 may look like:5

– Principle 1: Honor policies throughout the confidential data life span.

  • Choice and consent (collection, use and disclosure)
    - Inadequate notice of data collection, use, disclosure and redress policies
    - Unclear or misleading language or processes for the user to follow in choosing and providing consent for the collection and use of personal information
  • Individual access and correction
    - Limited or nonexistent means for users to verify the accuracy of their personal information
  • Accountability
    - Lack of controls to enforce customer choice and consent as well as other relevant policies, laws and regulations

– Principle 2: Minimize risk of unauthorized access or misuse of confidential data.

  • Information protection
    - Lack of reasonable administrative, technical and physical safeguards to ensure confidentiality, integrity and availability of data
    - Unauthorized or inappropriate access to data
  • Data quality
    - Inability to verify accuracy, timeliness and relevance of data
    - Inability of users to make corrections as appropriate

– Principle 3: Minimize the impact of confidential data loss.

  • Information protection
    - Insufficient safeguards to ensure the confidentiality of data if they are lost or stolen
  • Accountability
    - Lack of a data breach response plan and an escalation path
    - Lack of system encryption of all confidential data
    - Inability to verify adherence to data protection principles through appropriate monitoring, auditing and use of controls

– Principle 4: Document applicable controls and demonstrate their effectiveness.

  • Accountability
    - Improper documentation of plans, controls, processes or system configurations
  • Compliance
    - Inability to verify or demonstrate compliance through existing logs, reports and controls
    - Lack of a clear noncompliance escalation path and process
    - Lack of a breach notification plan and other response plans that are required by law

Identifying these threat types offers a starting point for organizations to assess their data flows and consider how assumptions about privacy, confidentiality and compliance may change when a flow crosses a trust boundary, such as during transitions between life-cycle phases.
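One way to make the example categories usable during threat enumeration is to hold them as a catalogue keyed by principle and to generate a checklist for each boundary crossing. The sketch below assumes that structure; the threat wording is shortened from the list above, and `THREAT_CATEGORIES` and `checklist` are hypothetical names.

```python
from __future__ import annotations

# Principle number -> category -> threats (wording shortened from the list).
THREAT_CATEGORIES: dict[int, dict[str, list[str]]] = {
    1: {
        "Choice and consent": [
            "Inadequate notice of collection, use, disclosure and redress policies",
            "Unclear or misleading consent language or processes",
        ],
        "Individual access and correction": [
            "Limited or nonexistent means to verify personal information",
        ],
        "Accountability": [
            "Lack of controls to enforce customer choice and consent",
        ],
    },
    2: {
        "Information protection": [
            "Lack of reasonable administrative, technical and physical safeguards",
            "Unauthorized or inappropriate access to data",
        ],
        "Data quality": [
            "Inability to verify accuracy, timeliness and relevance of data",
            "Inability of users to make corrections",
        ],
    },
    3: {
        "Information protection": [
            "Insufficient safeguards if data are lost or stolen",
        ],
        "Accountability": [
            "Lack of a data breach response plan and escalation path",
            "Lack of system encryption of all confidential data",
            "Inability to verify adherence to data protection principles",
        ],
    },
    4: {
        "Accountability": [
            "Improper documentation of plans, controls, processes or configurations",
        ],
        "Compliance": [
            "Inability to demonstrate compliance through logs, reports and controls",
            "Lack of a clear noncompliance escalation path and process",
            "Lack of a legally required breach notification plan",
        ],
    },
}


def checklist(crossing: str) -> list[tuple[str, int, str, str]]:
    """List every (crossing, principle, category, threat) to be reviewed."""
    return [
        (crossing, principle, category, threat)
        for principle, categories in THREAT_CATEGORIES.items()
        for category, threats in categories.items()
        for threat in threats
    ]


items = checklist("application servers -> cloud provider storage")
for _, principle, category, threat in items:
    print(f"Principle {principle} | {category}: {threat}")
```

An organization would replace the catalogue contents with categories that reflect its own policies, industry, geography and legal compliance framework.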

  • Step 3: Analyze risk—Most organizations have already taken some steps to ensure data security and privacy, as specified by their existing control framework and/or information security management system. To complete this step, the organization should first gather information about its existing protective controls, technologies and activities. Then, for each cell in the Risk/Gap Analysis Matrix, it should determine which controls, technologies and activities support compliance with each of the four privacy and confidentiality principles. This step concludes when the threats that are not addressed by existing protective measures are identified in the appropriate cells of the matrix and when the related risks have been evaluated.
  • Step 4: Identify mitigation measures—In the appropriate cells of the matrix, organizations should list additional controls, technologies and activities that are necessary to bring each risk to an acceptable level, and then evaluate the cost/benefit of each. This step concludes when the organization decides whether and how each identified risk will be mitigated, transferred or assumed.
  • Step 5: Evaluate the effectiveness of mitigation measures—Organizations should review the results of the preceding steps and reinitiate the cycle if unacceptable risks remain.
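Steps 3 through 5 can be sketched as a small loop over matrix cells. The numeric risk scale, the `ACCEPTABLE_LEVEL` threshold and all names below (`Risk`, `Treatment`, `analyze`, `remaining_unacceptable`) are assumptions made for illustration; the framework does not prescribe a scoring method.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Treatment(Enum):
    MITIGATE = "mitigate"
    TRANSFER = "transfer"
    ASSUME = "assume"


@dataclass
class Risk:
    cell: tuple[str, str]  # (life-cycle stage, matrix column)
    threat: str
    level: int  # assumed scale: 1 (low) to 5 (high)
    treatment: Optional[Treatment] = None


ACCEPTABLE_LEVEL = 2  # assumed threshold


def analyze(threats, addressed) -> list[Risk]:
    """Step 3: keep only threats no existing measure addresses in that cell."""
    return [
        Risk(cell, threat, level)
        for cell, items in threats.items()
        for threat, level in items.items()
        if threat not in addressed.get(cell, set())
    ]


def remaining_unacceptable(risks: list[Risk]) -> list[Risk]:
    """Step 5: undecided risks, or mitigated risks still above the threshold."""
    return [
        r for r in risks
        if r.treatment is None
        or (r.treatment is Treatment.MITIGATE and r.level > ACCEPTABLE_LEVEL)
    ]


threats = {
    ("transfer", "information protection"): {"media lost in transit": 4},
    ("store", "identity and access control"): {"unauthorized access": 3},
}
addressed = {("store", "identity and access control"): {"unauthorized access"}}

risks = analyze(threats, addressed)

# Step 4: decide on a treatment; here, encrypting the media is assumed
# to lower the residual level of the one unaddressed risk.
risks[0].treatment = Treatment.MITIGATE
risks[0].level = 1

if remaining_unacceptable(risks):
    print("Unacceptable risks remain: reinitiate the cycle")
else:
    print("All identified risks decided and within the acceptable level")
```

Transferred and assumed risks are treated as decided in this sketch, so only undecided or insufficiently mitigated risks trigger another cycle.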

Conclusion

As organizations manage growing volumes of confidential data, they face increasingly complex challenges in protecting the data against theft, misuse or unauthorized disclosure. In addition, organizations need to take steps to prevent the accidental collection or use of customer and employee personal information in violation of each individual’s preferences, and to meet related compliance obligations.

A program based on the Data Governance for Privacy, Confidentiality and Compliance framework complements existing security standards and control frameworks by providing a holistic approach to identifying data-flow-specific threats against data privacy, security and compliance and by addressing residual risks in effective and efficient ways.6

This article provides a high-level overview of the three components of the DGPC framework—people, process and technology—and a more detailed summary of key aspects of the technology component:

  • The use of data-centric threat modeling for security, privacy and compliance that complements but does not substitute for “traditional” security threat modeling, which is application-/process-centric.
  • The selection of appropriate controls, technologies and activities that address flow-specific residual risks through the use of the Risk/Gap Analysis Matrix. This is a simple tool that can help organizations understand how different technologies and protective measures come together in the context of an application’s information life cycle to treat the aforementioned risks.

Endnotes

1 For more information about the DGPC framework, see the white papers and webcasts of the series “A Guide to Data Governance for Privacy, Confidentiality, and Compliance,” available at www.microsoft.com/datagovernance.
2 Open Security Foundation DataLossDB, http://datalossdb.org
3 These policies may consist of requirements derived from laws, standards, promises, individual customer or employee choices, commercial obligations, and other sources.
4 For a detailed description of “traditional” threat modeling and how to build data flow diagrams, see “Microsoft IT Infrastructure Threat Modeling Guide,” available from http://technet.microsoft.com/en-us/library/dd941826.aspx.
5 For a more detailed list of technical and nontechnical questions that illustrate the types of threats against each principle, see the Application Privacy Assessment questionnaire available at www.microsoft.com/privacy/datagovernance.
6 To learn more about Microsoft’s DGPC framework and how Microsoft tools can be used to identify and mitigate gaps in existing protective measures, visit www.microsoft.com/datagovernance.

Javier Salido, CIPP, has 12 years’ field experience working with large enterprise and government customers, first as a consultant and later as director of services for Microsoft Consulting in Mexico, Argentina, Chile and Uruguay. A former researcher at the Network Security Laboratory at the University of Washington (USA), Salido has published multiple works at IEEE conferences, in IEEE/ACM Transactions on Networking and in the “A Guide to Data Governance for Privacy, Confidentiality, and Compliance” series published by Microsoft Corp. He currently works with the Microsoft Trustworthy Computing Group on privacy-related topics.