AASHTO Core Data Principles

Data Management and Analytics

In 2013, this set of AASHTO Core Data Principles was adopted by the data and planning community. The principles were developed and vetted by:

  • the SCOP data sub-committee,
  • the CTPP oversight board (data community), and
  • US DOT data leadership.

The principles have been vetted through SCOP and by key state and federal transportation data staff.

Our 7 Core Data Principles Defined

Principles 1-3

Principle 1 - VALUABLE: Data is an Asset

Data is a core business asset that has value and is managed accordingly.

Rationale — Data is a core industry asset that has measurable value and is managed accordingly. Accurate, timely data is critical to accurate, timely decisions. Transportation agencies already manage many of their physical assets: roads, bridges, signs, lights, etc. Data is no different and must be managed with the same care as those physical assets. Data is the foundation of our decision-making, so we must carefully manage and maintain it to ensure that we know what we have and where it is, can rely upon its accuracy, and can obtain it when and where we need it. Where possible, data should be archived to maintain historical records.

Implications — Treating data as the asset that it is saves money, effort and resources. When data is handled appropriately, it can have a long life with many uses beyond its original one and can serve projects as yet unplanned.

Principle 2 - AVAILABLE: Data is open, accessible, transparent and shared

Access to data is critical to performing duties and functions. Data must be open to all and usable for diverse applications.

Rationale — The value of data increases when it can be combined with other data and used in a variety of applications. Users must have access to the data critical to their duties and functions. Wide access to data leads to efficient and effective decision-making and affords timely responses to information requests. Data use must be considered from an enterprise perspective (across the organization or across multiple organizations) to allow access by a wide variety of users. Transportation agencies at all levels of government (federal, state, and local) hold a wealth of diverse data sets, but these are often stored in separate databases that are incompatible with each other or difficult to find. Timely access to accurate data is essential to improving the quality and efficiency of decision-making. It is less costly to maintain timely, accurate data once and share it than to maintain duplicate data in multiple locations or processes. Shared data also improves decisions, because decision-makers rely on fewer sources that are more accurate and better maintained. Sharing is also necessary to triangulate on subjects that may not be measured directly, and it allows for serendipity: insights often come from bringing fresh eyes to data.

As transportation organizations work with more stakeholders and external partners, it is essential that data be shared. Making data electronically available increases efficiency because existing data entities can be re-used. It is more effective to err toward opening transportation data than toward over-protecting it.

Implications — Agencies are increasingly under an informal mandate to “do more with less.” Sharing data is a key step in executing this mandate. Accessible data will ultimately reduce the burden on staff time.

Principle 3 - RELIABLE: Data quality and extent is fit for a variety of applications

Data quality is acceptable and meets the needs for which it is intended.

Rationale — Data quality is acceptable and meets the need for which it is intended. Data that is collected, produced, and reported must be fit for purpose; that is, its accuracy and integrity must be proportional to its use and to the cost of collecting and maintaining it. Data is used in all areas of the transportation decision-making process, from planning to design to operations to performance management. Furthermore, it is increasingly being used externally by citizens and customers to inform their personal decisions, and by stakeholders to assess the aggregate performance of a transportation organization. Significant human and system resources are consumed in the collection, manipulation and dissemination of data, whether of high quality or not, so the most effective use of public funds depends on appropriately directed attention to data quality and to the procedures that realize it. Additionally, data must be archived appropriately to preserve both its usefulness and the historical record.

When possible, data should be spatially oriented. Data quality tends to increase as the data is applied more widely. Data that has spatial orientation or attribution can easily be used in GIS. When data assets can be analyzed in a spatial context, the analysis gains geographic depth, and both the data and the results can be communicated more easily through maps and other formats suited to public understanding.

Implications — When data is fit for purpose, appropriate cost decisions are made in its collection and use. Where a rough sketch suffices, data collection and use can be scaled accordingly. Where large programs, investments, or systems are being developed and vetted, the data must be fit for that purpose. Data precision is matched to the task at hand.
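
As an illustration only, matching precision to the task can be sketched as a simple tolerance check. The use names, the tolerance values, and the function names below are hypothetical assumptions chosen for the example; they are not part of the AASHTO principles.

```python
# Hypothetical accuracy tolerances (metres of positional error) for each use.
# The use names and numbers are illustrative assumptions, not AASHTO values.
REQUIRED_ACCURACY_M = {
    "sketch_planning": 100.0,
    "corridor_study": 10.0,
    "design": 0.5,
}


def fit_for_purpose(dataset_accuracy_m: float, use: str) -> bool:
    """Return True when the dataset's error is within the tolerance for the use."""
    return dataset_accuracy_m <= REQUIRED_ACCURACY_M[use]


def suitable_uses(dataset_accuracy_m: float) -> list[str]:
    """List every use (alphabetically) for which the dataset is accurate enough."""
    return sorted(
        use
        for use, tolerance in REQUIRED_ACCURACY_M.items()
        if dataset_accuracy_m <= tolerance
    )
```

Under these assumed tolerances, a dataset accurate to 25 m would be adequate for sketch planning but not for design, so paying for survey-grade collection would only be justified where the use demands it.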

Principles 4-5

Principle 4 - AUTHORIZED: Data is secure and compliant with regulations

Data is trustworthy and is safeguarded from unauthorized access, whether malicious, fraudulent or erroneous.

Rationale — Data is trustworthy and is safeguarded from unauthorized access, whether malicious, fraudulent or erroneous. Open sharing of information and the release of information via relevant agreement must be balanced against the need to restrict the availability of classified, proprietary, and sensitive information.

Implications — When data is secure and appropriately regulated, there is greater trust and confidence in its use.

Principle 5 - CLEAR: There is a common vocabulary and data definition

Data dictionaries are developed and metadata established to maximize consistency and transparency of data across systems.

Rationale — Both unstructured and structured data must have a common definition to enable sharing of data. However, data must not be degraded to the point that it no longer serves its original purpose. Commonality may take the form of relations, bridges and crosswalks between definitions.

Implications — A common vocabulary will facilitate communication, make dialogue effective and support interoperability of systems. However, utility must not be compromised.
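
A crosswalk between definitions can be sketched as a lookup table, as a minimal example. The agency names, condition codes, shared terms, and function names below are hypothetical and exist only to illustrate the idea. The original value is kept next to the shared term so the data still serves its original purpose.

```python
# Hypothetical crosswalk: each agency's local pavement-condition codes mapped
# to a shared vocabulary. All names and codes here are illustrative assumptions.
CROSSWALK = {
    "agency_a": {"G": "good", "F": "fair", "P": "poor"},
    "agency_b": {"1": "good", "2": "good", "3": "fair", "4": "poor", "5": "poor"},
}


def to_shared(source: str, local_code: str) -> str:
    """Translate a local code to the shared term; fail loudly if undefined."""
    try:
        return CROSSWALK[source][local_code]
    except KeyError:
        raise ValueError(
            f"no crosswalk entry for {source!r} code {local_code!r}"
        ) from None


def translate_record(source: str, record: dict, field: str = "condition") -> dict:
    """Return a copy of the record with the shared term added.

    The original field is left untouched so no local detail is lost.
    """
    out = dict(record)
    out[field + "_shared"] = to_shared(source, record[field])
    return out
```

For example, `translate_record("agency_b", {"id": 7, "condition": "2"})` would add `"condition_shared": "good"` while leaving the five-level local code in place. Raising an error on an unmapped code, rather than guessing, is one possible design choice for keeping the shared vocabulary trustworthy.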

Principles 6-7

Principle 6 - EFFICIENT: Data is not duplicated

Data is collected once and used many times for many purposes.

Rationale — Developing information services that are available to multiple users and stakeholders is preferred over developing information and data silos that serve only a single purpose or user.

Implications — Duplicative capability is expensive and propagates conflicting data. It also goes against a policy of sustainability in the use of data and the infrastructure resources required to maintain the data, such as computer servers and data warehouses.

Principle 7 - ACCOUNTABLE: Decisions maximize the benefit of data

Timely, relevant, high-quality data are essential to maximizing the utility of data for decision making.

Rationale — The purpose of data collection is to help support the decision-making process. Users of the data, as well as information derived from the data, are the key stakeholders in the data collection and analysis process. The data is being collected to address a certain policy goal or objective. In order to ensure information management is aligned with the purpose, users must be involved in the different aspects of the information environment. The decision makers, managers, and the technical staff responsible for developing and sustaining the information environment need to come together as a team to jointly define the goals and objectives of the data collection processes.

Implications — Resources are limited. Maximizing existing resources is essential.