Synergizing Data Management, DataOps, and Data Pipelines for AI-Enhanced Embedded Systems
Doctoral thesis, 2024
Objectives: This thesis is structured around three primary objectives. The first objective is to comprehensively understand and address the data management challenges associated with embedded systems. Building upon this understanding, the second objective is to explore data management practices that can help alleviate these challenges. Finally, the third objective is to develop and validate implementation approaches for enhanced data management.
Method: To achieve the objectives, we conducted research in close collaboration with industry and used a combination of empirical research methods such as interpretive case studies, literature reviews, and action research.
Results: This thesis presents six main results. First, it identifies and categorizes data management challenges, solutions, and limitations. Second, it presents a stairway model delineating the stages of the evolution towards DataOps. Third, it proposes a model for evaluating the maturity of data pipelines and identifies determinants to assess the impact of machine learning (ML) on data pipelines. Fourth, it identifies the differences between unidirectional and bidirectional data pipelines and the significance, benefits, and challenges of bidirectional data pipelines. The thesis also provides a roadmap for the smooth migration from unidirectional to bidirectional data pipelines. Fifth, it presents and validates the conceptual model of an end-to-end data pipeline for ML/DL models. Finally, it presents and validates fault-tolerant data pipelines and an AI-powered 4-stage model for automated fault recovery in data pipelines.
Conclusion: In conclusion, this thesis demonstrates a well-structured approach to data management in AI-enhanced embedded systems, supported by innovative practices and robust implementation approaches, that is essential for ensuring the reliability and effectiveness of data in decision-making processes.
Robustness
Bidirectional
Fault-Tolerance
Automated Fault Recovery
Data Pipelines
Data Management Challenges
DataOps Evolution
Author
Aiswarya Raj Munappy
Software Engineering 1
My research endeavors to revolutionize data management practices at an industrial scale. Through an in-depth exploration, we uncovered a multitude of challenges in managing data for deep learning applications. By combining insights from real-world case studies and literature reviews, we dissected the current state of data management approaches, paving the way for a more robust and efficient methodology. With the surge of Artificial Intelligence (AI) technologies, our study takes a step forward by modelling a resilient data pipeline tailored for AI-enhanced embedded systems. This pipeline acts as a lifeline, guiding the flow of data seamlessly through complex networks and ensuring reliability and accuracy at every stage. But our journey doesn't end there. We draw attention to the faults that exist in data pipelines but are frequently overlooked. By identifying these faults and implementing proactive mitigation strategies, we empower industries to navigate the challenges of data management with confidence, minimizing human intervention and maximizing productivity.
This thesis serves as a guide for both academia and industry. Researchers are invited to delve deeper into the practical challenges of data management and data pipelines left unexplored. Meanwhile, industry practitioners are encouraged to reflect on the pivotal role of adopting tailored data management and data pipeline practices, particularly in the domain of AI-enhanced embedded systems. As we embark on this transformative journey, let us embrace the power of data management as a catalyst for innovation and progress, propelling us towards a future where data-driven decisions shape a brighter tomorrow.
Software Engineering for AI/ML/DL
Chalmers AI Research Centre (CHAIR), 2019-11-01 -- 2022-11-01.
Infrastructure
C3SE (Chalmers Centre for Computational Science and Engineering)
Subject Categories
Software Engineering
ISBN
978-91-8103-052-5
Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 5510
Publisher
Chalmers
Mötesrum 473 is located on Campus Lindholmen. Go to building Jupiter. Entrance from Hörselgången 5. Go to floor 4.
Opponent: Xiaofeng Wang, University of Bolzano, Italy