
Data Governance Best Practices and Recommendations Report
Transportation System Management and Operations (TSMO) Vision and Regional Intelligent Transportation Systems (ITS) Architecture Update
Final, May 24, 2019
Prepared for
Prepared by
With Support from Kimley-Horn, ConSysTec, and Lumenor

Version Control

Version | Date | Description | Editor / Reviewer
0.1 DRAFT | 8 April 2019 | Original version | ICF
0.2 DRAFT | 15 April 2019 | First internal review | ICF Team
0.3 DRAFT | 17 April 2019 | Additional review and comments | ICF Team
0.4 DRAFT | 18 April 2019 | Draft Deliverable | ICF Team
1.0 Final | 24 May 2019 | Final Deliverable; incorporated results from FHWA Data Business Plan Workshop | ICF Team

Table of Contents

1. Introduction
1.1. Scope
1.2. Background
1.3. Document Organization
2. ARC Challenges and Data Governance Benefits
3. Data Governance Overview
3.1. Data Governance Defined
3.2. Data Governance Frameworks
4. Data Governance Framework: Business Strategies and Organization
4.1. Overview
4.2. Data Governance Goals and Objectives
4.3. Policies
4.4. Organization
4.5. Performance and Maturity Models
5. Data Lifecycle Management
6. Changing Needs in Transformative Transportation Environments
6.1. General Implications
6.2. Specific Implications for Transportation Data
6.2.1. Integrated Transportation Management Systems Impacts
6.2.2. Mobility on Demand and Accessible Travel Impacts
6.2.3. Automated Vehicle Impacts
7. Getting Started with Data Governance
7.1. FHWA Approach
7.2. MnDOT Approach
7.3. FHWA Challenges and Lessons Learned
8. ARC’s Role in a Regional Data Governance Framework
8.1. Recommendations for Establishing a Regional Data Governance Framework
8.1.1. Data Set Catalog
8.1.2. Agency Roles and Responsibilities
8.2. Recommendations for Establishing a Long Term Data Governance Framework
9. Endnotes
10. Appendices

Appendix A: Acronym Table
Appendix B: Workshop #2 Data Discussion Summary
11. Bibliography

List of Figures

Figure 1: FDOT ROADS Project Data Governance Overview
Figure 2: Data Governance and Stewardship Context Diagram from DMBOK2
Figure 3: MnDOT Data Governance Roles
Figure 4: Extended Data Governance Roles for a Distributed Enterprise
Figure 5: IBM Maturity Model
Figure 6: IBM Model to Assess Effective Data Governance
Figure 7: DCC Curation Lifecycle Model
Figure 8: Digital Maps for AV Supply Chain (source: USDOT)
Figure 9: MnDOT Data Governance Framework
Figure 10: Data Business Plan Development Process
Figure 11: Federal Data Catalog (https://catalog.data.gov/dataset)
Figure 12: Snapshot of the Maryland Transit Data Catalog (https://data.imap.maryland.gov/items?page 3&tags CME)
Figure 13: Longer Term Data Governance Framework Process

List of Tables

Table 1: FHWA Data Governance Goals and Objectives
Table 2: Data Governance Principle Focus
Table 3: FHWA Data Governance Policies [source: FHWA, 2015]
Table 4: MnDOT Recommendations and Suggested Strategies for Data Governance
Table 5: Experiences, Benefits, Challenges and Lessons Learned from Implementing Data Governance
Table 6: Data Catalog Attribute List

1. Introduction

1.1. Scope

This white paper on Best Practices for Data Governance (DG) explores industry recommendations on the purpose, benefits, and strategies related to applying data governance to the role of the Atlanta Regional Commission (ARC) with respect to “data”. The area of data governance is extensive, with every enterprise architecture and information technology business analysis methodology promoting a framework for people, processes, and technologies to manage their data.

This paper introduces the various topics related to data governance best practices and provides an extensive bibliography. However, it focuses the reader on the role of ARC in fostering good data governance practices rather than on implementing regional data management processes.

1.2. Background

Effective transportation systems management and operations (TSMO) at a regional scale involves coordination among a wide array of partners, including agencies involved in operating highways, transit services, and emergency response services, as well as the private sector, to optimize system performance. Data sharing for enhanced situational awareness is key to implementing many TSMO strategies; moreover, sharing real-time data and predictive analytics with the public and private sectors plays an important role in influencing travel decisions. High-quality, consistent data that is managed over time and throughout its life is the foundational element of multi-agency, regional approaches to TSMO.

At the same time, public agencies operate with limited and shrinking resources, changing technological landscapes, and shifting roles and expectations. Data is becoming a major asset and investment.
To manage these assets, government needs to become
• responsive, with the ability to transform data into information and decisions;
• supportive of interoperability in order to share information;
• an effective and efficient data custodian that manages data discovery and access; and
• a trusted source that manages data quality and privacy.

To that end, key data should be planned and managed to support the enterprise rather than just a single project, which is what is often done today. Data governance, therefore, is becoming increasingly important for organizations and for systems of organizations that work together. Many organizations that adopt data governance practices recognize the need to undergo a cultural transformation, changing the ways individuals and systems handle and process data. TSMO strategies also require cultural change; thus, there may be no better time than now for ARC and its stakeholders to adopt data governance practices.

1.3. Document Organization

This document describes best practices in data governance as applied by transportation organizations. The document is organized as follows:

Section 2 ARC Challenges and Data Governance Benefits. Section 2 introduces the benefits of data governance given current challenges in data access and exchange in the ARC region.

Section 3 Data Governance Overview. This section describes data governance, its definition, and its framework components. Many organizations claim to be the authority on best practice. This section summarizes the various methodologies and identifies their commonalities.

Section 4 Data Governance Framework: Business Strategies and Organization. Section 4 describes the various data governance framework components and how transportation organizations implement them. The components include goals and objectives, policies, organizational models, and maturity models.

Section 5 Data Lifecycle Management. Section 5 gives an overview of the data lifecycle and the categories of plans and procedures used to curate data over its life.

Section 6 Changing Needs in Transformative Transportation Environments. This section describes future challenges with changing and emerging transportation technologies and strategies. Detailed topics include impacts due to integrated transportation management systems, mobility on demand and accessible travel, and automated vehicles.

Section 7 Lessons Learned from Government Initiatives. Section 7 describes the approach used by transportation agencies to get started: what drives them to adopt data governance frameworks, how they get started, and the steps recommended by the USDOT to set up a data governance framework.

Section 8 ARC’s Role in a Regional Data Governance Framework. Section 8 provides a set of recommendations for ARC to initiate data governance for regional constituents and stakeholders (both internal and external). The section includes recommendations for ARC’s role and responsibilities, as well as plans and artifacts needed to promote good data management practices for the region.

Section 9 End Notes. This section provides notes and references for citations contained in the report.

Appendices. The appendices section includes Appendix A, the Acronym Table, and Appendix B, the results from the data exercises conducted during Workshop #2 (2019 March 18).

Bibliography. In addition to providing the full references for the end notes, the Bibliography provides example documents that ARC can use to model framework components. In particular, the Data Business Plan (Hillsborough ) describes

2. ARC Challenges and Data Governance Benefits

In workshop #2 of the ARC TSMO Vision and Intelligent Transportation System (ITS) Regional Architecture project, participating agencies were asked about their major challenges with sharing data. The organizations represented by the 53 participants identified many challenges in collecting, analyzing, and sharing data, many of which can be addressed by establishing data governance policies, procedures, and standards. A robust data governance (DG) framework directly addresses these issues by defining the right procedures, standards, and policies to manage data. In this sense, establishing a DG framework will increase data interoperability, quality, sharing, and effectiveness, as well as reduce costs.

When workshop participants were asked to describe their three major challenges in sharing data with other organizations, five common themes emerged from their responses. The top five issues identified by the stakeholders, along with insight on how adopting a data governance framework helps, are described below.

Challenge #1: Inconsistent access / Challenges to access (platform) / Data discovery. Agencies and stakeholders have different data sharing platforms with varying levels of access and security, which yields inconsistent access to data and even inconsistent data across stakeholders; that is, there is no clear guidance on how to expose what data exists and which organization or department has it. The challenge is both knowledge-oriented and technology-related: What data is available? From whom? Where is it stored? How often is it updated? And how can it be accessed without major technology issues?
• DG provides “rules of engagement” that describe user and owner roles, access procedures, and methods of exposing and describing data sets irrespective of technology or platform. The rules of engagement promulgated by a data governance framework help improve sharing efficiencies through the adoption of: (1) data discovery services to support searching for data across multiple organizations; and (2) technology-agnostic and role-based data access methods.

Challenge #2: Inconsistent structures, formats, and semantics. Current systems have incompatible data descriptions, which makes it difficult for organizations to understand data in detail (e.g., type, meaning, scale, temporal coverage, estimated vs. observed) and to integrate it into their systems.
• A DG framework provides rules and guidelines on how to describe, organize, and share data. This ensures that all data is collected, named, defined, and grouped consistently and according to standards across all stakeholders, including vendors. This standardization also helps with any future system or software integration, because agencies then operate using a consistent data organization and structure. Finally, this standardization facilitates aggregating data to provide key performance metrics in an efficient manner.

Challenge #3: Unclear data responsibility. Currently, there are limited organizational structures that specify management, accountability, and audit responsibilities for data. These details include accountability for upkeep, quality, description, and dissemination to downstream users, including sharing information with other stakeholders.
Data curation is often overlooked once data is collected or an application is deployed. Information about data quality, lineage, point of contact, or storage location may not be maintained because no one is assigned to manage the data.
• A DG framework identifies the need to define the roles and responsibilities of data owners, stewards, and users of data over its lifecycle. A common theme of data governance is the relationship of people to data: who is responsible for data curation, who is responsible for ensuring the data serves enterprise needs, who is accountable for the quality of and access to the data, who owns the data, and how the data is used, specifically as it relates to privacy issues. As such, a DG framework designates roles such as data stewards, data custodians, data policy committees, and data champions, as well as the responsibilities assigned to each.

Challenge #4: Data restrictions. The lack of defined data ownership and rights to data also leads to unclear data distribution and use; that is, what can or cannot be shared due to contracting agreements or licensing restrictions?
• By articulating data policies for sharing and use, a DG framework clarifies the distribution and privacy rules for requests made by internal and external stakeholders. As such, DG helps in setting clear relationships to manage shared data and information exchanges among internal and external stakeholders, addressing any policy or legal limitations on sharing data.

Challenge #5: Limited and costly resources to manage data. A vast amount of data is being collected and processed for static and real-time consumption. Collecting, managing, and distributing this data requires resources, including human resources, that may exceed the financial capabilities of the stakeholders.
• A common feature of most DG frameworks is the development of a data business plan that
  o prioritizes critical data needs,
  o identifies redundancies in data collection, processing, and storage,
  o develops strategies for migrating manual collection and quality control to automated processes, and
  o frames organizational responsibilities for data stewards, whose role is to manage data for the enterprise.
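
The discovery questions raised under Challenge #1 (what data is available, from whom, where it is stored, how often it is updated) and the role-based access mentioned in the DG response can be illustrated with a small sketch. This is only an illustration under stated assumptions: the `CatalogEntry` fields, the `discover` and `can_access` helpers, the agency names, and the role names are all hypothetical and are not taken from any ARC or FHWA specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical description of one data set, independent of platform."""
    name: str               # what data is available
    owner_agency: str       # from whom
    steward_contact: str    # who is accountable for upkeep and quality
    location: str           # where it is stored (free-text pointer)
    update_frequency: str   # how often it is updated
    keywords: frozenset     # terms used for discovery
    allowed_roles: frozenset  # roles permitted to access the data


def discover(catalog, keyword):
    """Return entries tagged with the keyword (case-insensitive match)."""
    term = keyword.lower()
    return [e for e in catalog if term in {k.lower() for k in e.keywords}]


def can_access(entry, role):
    """Role-based check that does not depend on the storage technology."""
    return role in entry.allowed_roles


# Illustrative entries only; agencies, contacts, and roles are made up.
CATALOG = [
    CatalogEntry(
        name="Signal timing plans",
        owner_agency="Agency A",
        steward_contact="steward@agency-a.example",
        location="Agency A traffic operations server",
        update_frequency="monthly",
        keywords=frozenset({"signals", "arterial"}),
        allowed_roles=frozenset({"agency_staff"}),
    ),
    CatalogEntry(
        name="Bus stop locations",
        owner_agency="Agency B",
        steward_contact="steward@agency-b.example",
        location="Agency B open data portal",
        update_frequency="quarterly",
        keywords=frozenset({"transit", "stops"}),
        allowed_roles=frozenset({"public", "agency_staff"}),
    ),
]

transit_sets = discover(CATALOG, "Transit")
```

In this sketch the catalog records only descriptions of data sets, not the data itself, which is consistent with the idea of exposing and describing data irrespective of technology or platform.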

Additional challenges articulated by workshop participants include:
• Data needs with respect to interfaces and quality that will support my objectives and outcomes
• Privacy issues and policies
• Geographic data inconsistencies

A complete set of the challenges, as well as current and future data sharing needs, is included in Appendix B: Workshop #2 Data Discussion Summary.

3. Data Governance Overview

3.1. Data Governance Defined

Data governance as a discipline has been part of enterprise architectures and information technology (IT) processes since the early 1980s and has been defined by many groups. Depending on the purpose, different definitions of data governance focus on specific core values that are critical to that group, for example:

MDM Institute [1] defines data governance as:
“the formal orchestration of people, processes, and technology to enable an organization to leverage data as an enterprise asset”
With a general focus on people, processes, and technology.

Forrester [2] defines data governance as:
“A strategic business program that determines and prioritizes the financial benefit data brings to organizations as well as mitigates the business risk of poor data practices and quality. At the heart of this program is ownership, accountability, processes, planning, and performance management.”
With a focus on the fiduciary responsibility and organizational planning for managing data.

Data Governance Institute [3] defines data governance as:
“a system of decision rights and accountabilities for information-related processes, executed according to agreed-upon models, which describe who can take what actions with what information, and when, under what circumstances, using what methods.”
With a focus on mid-level manager responsibilities and rules of engagement.

NASCIO [4] defines data governance as:
“the operating discipline for managing data and information as a key enterprise asset. This operating discipline includes organization, processes and tools for establishing and exercising decision rights regarding valuation and management of data.
Key aspects of data governance include decision making authority, compliance monitoring, policies and standards, data inventories, full lifecycle management, content management, records management, preservation, data quality, data classification, data security and access, data risk management, and data valuation.”
With a focus on rules of engagement and operating principles for managing data.
NASCIO reframes the definition to cover “information or knowledge management governance”. In this environment of social media, big data, and unstructured data, the renaming may be appropriate.

To more fully appreciate what data governance is, it is best to understand what it is not. According to Oracle Best Practices in Data Governance (2011) and Forrester, data governance is not data management or administration, data cleansing, master data management, or data storage/warehousing.

The common, recurring theme from the various DG definitions may be summarized as the
Rules of engagement for how institutions (people and policies) manage and sustain data across the enterprise, over its lifecycle [5].

Enterprise in this context includes an organization and its internal and external stakeholders. For the remainder of this document, these common themes represent data governance.
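
The Data Governance Institute definition quoted above ("who can take what actions with what information, and when, under what circumstances, using what methods") can be read as a lookup over agreed-upon rules. The sketch below is one hypothetical way to encode that reading; the `DecisionRight` fields, the role and data class names, and the `permitted_method` helper are assumptions made for illustration and do not come from the Data Governance Institute or any cited framework.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DecisionRight:
    """One hypothetical 'rule of engagement' entry."""
    role: str        # who
    action: str      # can take what action
    data_class: str  # with what information
    condition: str   # when, under what circumstances
    method: str      # using what method


# Illustrative rules only; every value here is invented.
RIGHTS = [
    DecisionRight("data_steward", "approve_release", "crash_records",
                  "after_privacy_review", "steward_sign_off"),
    DecisionRight("data_user", "read", "crash_records",
                  "after_release", "shared_portal"),
]


def permitted_method(role: str, action: str, data_class: str,
                     condition: str) -> Optional[str]:
    """Return the agreed method if a matching rule exists, else None."""
    for right in RIGHTS:
        if (right.role, right.action, right.data_class, right.condition) == (
                role, action, data_class, condition):
            return right.method
    return None


method = permitted_method("data_user", "read", "crash_records", "after_release")
```

Keeping the rules as data rather than as code paths reflects the point that governance describes decision rights and accountabilities and is distinct from the data management systems that enforce them.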

3.2. Data Governance Frameworks

A DG framework describes how all the pieces that compose data governance fit together. According to NASCIO,
“frameworks [in general] assist in describing major concepts and their interrelationships. Frameworks assist in organizing the complexity of a subject. Frameworks facilitate communications and discussion. All of these descriptors apply as well to frameworks related to data governance. Additionally, data governance frameworks assist in demonstrating how data governance relates to other aspects of data management, data architecture, and enterprise architecture.”

The Florida DOT (FDOT) Reliable, Organized, Accurate, Data Sharing (ROADS) Project Data Governance Overview presents a simplified relationship among the aspects of the framework, as shown in Figure 1. On the left side of the figure are the people (roles) associated with the framework, and the right side lists high-level responsibilities and processes.

Figure 1: FDOT ROADS Project Data Governance Overview

Data Management Association International (DAMA), a formal data governance organization, published a data management body of knowledge (DMBOK2, published July 2017) that provides detailed and comprehensive context diagrams that include goals for each objective; business and technical drivers; activities and roles; and inputs and outputs. An example of one of these context diagrams is illustrated in Figure 2.

Figure 2: Data Governance and Stewardship Context Diagram from DMBOK2

The DAMA DG knowledge area cites four major objectives:
• Data Governance and Stewardship
• Business Cultural Development
• Data in the Cloud
• Data Handling Ethics

Each objective has its own context diagram much like the one shown in Figure 2.

Generally, IT governance and enterprise architecture methodologies and tailored DG frameworks incorporate different aspects of data governance; however, there are consistent recurring themes throughout these various models, including the following characteristics:
• Accountability and leadership roles in the organization
• Planning and rules for data handling: quality, integrity, access
• Strategic enterprise perspective
• Cultural change to a data-centric organization

The following sections describe critical aspects of the DG Framework.

4. Data Governance Framework: Business Strategies and Organization

4.1. Overview

Initiating a DG framework is similar to developing a strategic plan. It starts with articulating a vision as well as objectives and goals for managing, sharing, and accessing data. In the case of ARC and its stakeholders, DG includes managing and sharing information across organizations. For ARC, the “enterprise” consists of many transportation and planning organizations within the region, with each organization responsible for collecting, managing, and curating the same or similar data, allocating resources, and applying its own policies and procedures to data curation activities. The data enterprise in the ARC region is a distributed, heterogeneous environment. Most data governance frameworks assume a single organization. Transportation agencies have adapted the enterprise DG framework to extend to a multimodal, multi-jurisdictional environment, one in which ARC can play a pivotal role. The regional data governance framework comes from aligning the elements of the framework (goals, objectives, policies, procedures, and organization) with other regional visions such as the TSMO vision, goals, and objectives. This section introduces elements of the DG framework and identifies methods to extend these elements to fit a regional DG model.

4.2. Data Governance Goals and Objectives

Data governance goals and objectives are derived from the organizational vision, goals, and objectives. Some emerge due to a major challenge or as an initiative to support another initiative. For example, state DOTs recognize that data governance is essential to developing and coordinating asset management systems, feature layers, and linear referencing systems for their Geographic Information Systems. Initiating data governance on a project basis and using the project as a platform to expand to other domains has worked for many organizations.
To that end, DG goals and objectives tend to focus on the problems encountered as well as good strategic planning practices. In ARC’s TSMO Visioning Workshop Summary [6], many goals tended to focus on sharing data not only between public sector organizations but also between “public and data providers and users.” [7] Goals tended to identify areas such as data integration, access, and quality. These are typical goals and objectives described by DG frameworks. Examples of goals and objectives cited by transportation agencies are included below.

In its DG Primer, the Federal Highway Administration (FHWA) identifies a sample set of goals and objectives for data governance (FHWA, 2015). The goals and objectives are listed in Table 1.

Table 1: FHWA Data Governance Goals and Objectives

Goal | Objectives
Leadership – Champion data solutions to ensure accountability and increase the value of data assets. | Promote data governance within FHWA. Communicate data-related changes to all interested parties. Monitor progress and ensure accountability of data governance tasks and projects.
Quality – Oversee efforts to provide acceptable quality data that is accurate. | Establish a Data Quality Assurance Program. Increase the accuracy and clarity of data. Improve accessibility of data.

Prioritization – Prioritize efforts to address data gaps and needs. | Establish clear priorities to address data gaps and needs. Communicate priorities to FHWA business units.
Cooperation – Facilitate cross organizational collaboration, data sharing, and integration. | Increase opportunities for data sharing. Eliminate data silos and other barriers. Ensure business units know the identity of Data Stewards. Ensure Data Stewards know the identity of Data Users.
Flexibility – Encourage creative and innovative solutions to data needs. | Identify innovative data solutions throughout FHWA. Communicate innovative solutions to Data Stewards and Data Users.
Utilization – Improve data utilization and ease of access. | Promote appropriate data usage throughout FHWA. Provide staff the means to determine the extent and availability of FHWA data.

Other organizations, such as Colorado and MnDOT, address objectives such as:
• Build a culture of data cooperation by involving all organization members in data collaboration (knowledge, access, accountability, use)
• Promote knowledge of data and reduce risk
• Develop guidelines that incorporate managing information value and reducing risk
• Understand and measure benefits of data management practices
• Involve business and IT in procurement decisions that incorporate data value

Data Governance Principles

Similar to objectives, though stand-alone, are data governance principles. Principle statements are values used to guide organizations in setting their priorities. When unforeseen issues occur, principles provide the needs and ideals that drive decisions. To guide priorities for data governance, organizations may develop a set of principles that focus on their values. For example, as seen in Table 2, Colorado Data Organization (CDO)’s focus is on business strategies
