On October 15–16, 2019, the MIT Center for Transportation and Logistics hosted representatives from 18 organizations for a roundtable on data management for machine learning in the supply chain. Six sessions focused on: 1) the importance of data, 2) managing organizational transformation, 3) organizational data governance, 4) data collection, 5) data wrangling, and 6) data visualization. Each session began with a short presentation followed by wide-ranging discussions. To encourage candor, statements have not been attributed to named organizations.
The first session focused on how the rapidly declining costs of data storage, bandwidth, and computation foster greater supply of and demand for data, and promote the development of fundamentally new data processing methods such as machine learning. However, getting value from data requires organizations to transform themselves to improve data management and data governance.
The next two sessions covered organizational issues of transformation and governance. Although data management seems to belong to IT, roundtable participants suggested that business stakeholders should own the data because they know where it comes from, what it means, who needs to see it, and how often it needs updating. Harmonizing both data and KPIs to create a single source of truth enables greater and more consistent use of data but requires either stakeholder alignment or a top-down mandate. Firms represented at the event were prioritizing digital projects through use cases and strategic imperatives.
The final three sessions covered the data supply chain within the organization. First, data collection gathers diverse types of data from diverse sources as needed for various business decisions. Second, data wrangling cleanses, harmonizes, and processes data for business users. Third, visualization requires user-centered design of dashboards and graphics to support different types of users: executives, operational managers, and front-line people. The goal is timely and actionable information that does not overwhelm the user. Finally, pilot projects in these areas must be converted to robust production systems to avoid technical debt.
Many of the discussions highlighted the key role of people, especially data scientists and data stewards. Hiring and retaining these essential specialists is challenging but can be achieved through tactics such as providing incentives and formal career paths for individuals. Finally, change management at all levels drives support for and use of data-driven decision-making processes.
Most companies were in the early stages of digital transformation, and much work remains. Participants also visited CTL’s Computational and Visual Education (CAVE) Lab to see how advanced visualization can be used on complex supply chain challenges.
After almost two days of presentations, deep dives, and discussions, participants cited key takeaways and planned next steps. They planned a range of future applications, including end-to-end integration, a shift from descriptive to more predictive and prescriptive analytics, and more sophisticated statistical models. Future roundtables may delve into issues such as the cross-functional challenges of data management and the need for more effective ways to visualize data.