Data Lake

The ALSET Educational Data Lake

We manage the ALSET Educational Data Lake, a resource for education researchers, policymakers, and innovators. The Data Lake securely houses data gathered from across the university, including data from campus IT systems, student surveys, and research studies.


Launched in 2016, the Data Lake now includes anonymised data on over 170,000 NUS students and alumni. The extent of our data is growing all the time—the image below shows the available datasets and those that we are working to incorporate into our Data Lake in 2020.


To request access to the current data catalogue, please click here.

For education and training purposes, we also maintain synthetic datasets that mimic the structure and content of the Data Lake. To request the documentation for our synthetic datasets, please click here.


Data Collection and Policy Analysis

To continuously grow the Data Lake, ALSET manages several longitudinal data collection efforts that will yield insight into how NUS undergraduates learn over their academic careers and perform in the workforce. Many of these projects are long-term investments in establishing NUS as a global hub for educational research.

One important use case of the Data Lake is the evaluation of key educational policies at NUS, such as the recent NUS Lifelong Learners Programme, an ambitious initiative that keeps student enrolment at NUS valid for 20 years from the point of admission. The Data Lake allows NUS researchers to follow how students and alumni learn throughout their lifetimes, not just during their university years. This is an unprecedented opportunity to optimise the long-term impact the university has on its students.


Data Governance and Ethics

NUS and ALSET are committed to ensuring that data in the Data Lake is protected and used ethically. Data management, access, and usage are governed by two bodies: the Learning, Analytics and Data Advisory Board and the Learning and Analytics Committee on Ethics (LACE). Their respective responsibilities are outlined in the chart below.


ALSET continually works with stakeholders from across NUS—including faculty, students, and advisors—to update its data ethics and data management policies and processes. Please contact us if you have any questions or comments about our data.


Data Operations

The ALSET Data Lake is co-managed by three stakeholder groups: ALSET’s Discovery Research Unit, ALSET’s Translational Research Unit, and the IT Department at NUS. Their respective responsibilities are outlined in the chart below.


If you have any further questions about the Data Lake, please contact Kevin Hartman, ALSET’s Translational Research Coordinator.