Accelerating Application Development using the IUDX Sandbox

By: Rakshit Ramesh and Jyotirmoy Dutta

The India Urban Data Exchange (IUDX) has been designed to provide an efficient and seamless exchange of data generated by smart cities via open and standardised interfaces. IUDX helps to avoid the data exchange bottlenecks that arise from proprietary, ad-hoc interfaces and their implementations. The end goal of this data exchange is to let the stakeholders, i.e., the data providers and the application developers, exchange data and create innovative data-driven solutions based on Artificial Intelligence and Machine Learning tools.

Data providers understand that their data has inherent value, and they also know that this value can increase manifold when the data is actively used by the developer community. Getting the developer community interested and engaged is a critical factor in making any data exchange platform successful. To that end, IUDX recently made available a sandbox platform that enables easy exploration of many valuable datasets available on the IUDX platform. Just as a real-world sandbox is designed to prevent sand from mixing with the surrounding soil, a data sandbox is designed to contain data in an isolated environment. It is an isolated testing environment that enables users to explore data sets, run programs, or open files without affecting the application, system, or platform on which they run. Also like a real-world sandbox, a data sandbox allows users to ‘play’ and experiment with their data. Sandboxes enable more agile data use while reducing the risk that comes with it. Data sandboxes are often used to facilitate ‘hackathons’, in which participants access specific data sets and prototype solutions in a controlled environment.

IUDX gives data providers the capability to host their data and let external users work with it, albeit in a secure environment. The sandbox also contains pre-built notebooks that provide analytics to bring out insights from the data and give a quick and thorough feel for the data and the domain. Users can download sample historical data, explore IUDX-curated notebooks of interesting use cases, and interactively perform data analytics on IUDX live and historical datasets. The idea of the sandbox at IUDX is based on three paradigms:

Connect: IUDX’s sandbox helps to connect the data generated by several Urban Local Bodies across the country to the users/consumers. The user can access these data files safely and quickly in a single place. This allows hackathon teams, academicians, researchers, data scientists, or anyone interested to explore the datasets and develop new solutions quickly. Users get access to rich datasets from various organizations under one roof. The platform mitigates the risks that come with data sharing, thus enabling experimentation and innovation.

Innovate: By allowing diverse datasets and use cases, the IUDX sandbox provides a platform for developing creative applications, powerful analytics, and new ideas. The use cases provided in the sandbox act as the starting point for any application. The IUDX sandbox allows the users to give wings to their imaginations. Combining data from fundamentally different domains, for example flood sensor data, location data for buses, and air quality monitoring data, can create ‘never seen before’ solutions.

Collaborate: Most data-driven solutions are slow to develop because of privacy bottlenecks and data storage issues. The IUDX sandbox, however, gives data providers an opportunity to let users work with their rich datasets and share their solutions in a controlled environment. This leads to collaborative projects in which multiple users can leverage the shared data and the shared working space to take the solutions further. The sandbox enables the users to scale and productionize their applications.

Figure 1: The paradigms of the IUDX sandbox

The key features of IUDX Sandbox are:

Easy to use: The IUDX sandbox has been designed so that users with little or no understanding of the IUDX platform can use it conveniently. The sandbox comes with thorough explanations of the domain and the data.

Data Modelling Agnostic: The IUDX sandbox is free of data modelling constraints so the end-user can rapidly use the data of any variety, regardless of format, schema, or point of origin.

Data Protection: The IUDX sandbox provides an isolated area with dedicated storage and processing resources, making one-off exploratory initiatives and experimentation possible without affecting other users or systems. The provider can connect the data to the global ecosystem of innovators without data security worries, while ensuring regulatory compliance and privacy.

Cost-saving: The IUDX sandbox is cloud-based and its services can be accessed free of cost. For the provider, opening data to external people, apps, and algorithms lowers the cost of data-driven innovation. For the developer, the platform gives free access to complex technologies, near-zero effort in accessing datasets, and a use case to start from; all these factors help in lowering the overall development cost.

Provides Fast Integration: The IUDX sandbox enables users to quickly integrate and aggregate data to determine how different sources relate to one another. For example, a user can combine air quality monitoring data with traffic data to look for a correlation between the traffic in a city and the air quality in some selected pockets. The IUDX sandbox environment helps users develop predictive models, conduct situational analytics, and increase the accuracy of decisions.
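The traffic and air quality example above can be sketched in a notebook cell as follows. This is only an illustrative sketch, assuming pandas is available in the notebook environment: the column names (`timestamp`, `pm2_5`, `vehicle_count`), the helper `correlate`, and the readings themselves are hypothetical and are not taken from any IUDX dataset.

```python
import pandas as pd

# Hypothetical hourly readings, made up for illustration only.
# In practice these frames would be loaded from two sandbox datasets.
hours = pd.to_datetime([
    "2021-03-01 08:00",
    "2021-03-01 09:00",
    "2021-03-01 10:00",
    "2021-03-01 11:00",
])
air_quality = pd.DataFrame({"timestamp": hours, "pm2_5": [62.0, 80.0, 71.0, 55.0]})
traffic = pd.DataFrame({"timestamp": hours, "vehicle_count": [410, 560, 480, 350]})


def correlate(aq: pd.DataFrame, tr: pd.DataFrame) -> float:
    """Join the two sources on timestamp and return the Pearson correlation."""
    merged = aq.merge(tr, on="timestamp", how="inner")
    return float(merged["pm2_5"].corr(merged["vehicle_count"]))


r = correlate(air_quality, traffic)
print(f"Pearson correlation between PM2.5 and vehicle count: {r:.2f}")
```

An inner join on the timestamp keeps only the hours present in both sources, which is a simple way to align datasets that were collected independently. A correlation found this way is a starting point for further analysis and does not by itself show that traffic causes the change in air quality.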

Rich Application domains: Air Quality monitoring data, Solid Waste Management, Ambulance traffic, Surat Transit Management systems, and Street lighting are some of the data sets that can be explored in the IUDX sandbox. As the variety and reach of the datasets increase, the sandbox offers a safe way to welcome diverse perspectives that can transform the datasets into unique, innovation-driven products.

Figure 2: Possibilities of IUDX sandbox

A simple schematic of the IUDX sandbox architecture is shown in Figure 3. The IUDX sandbox platform offers a no-setup, customizable Jupyter Notebooks environment. It gives free access to a large repository of community-published data and code. Users register and log in with a username and a password. They then get access to the repository of rich datasets and use cases, along with the option to modify and download notebooks. In the near future, the IUDX sandbox will add features such as a forum where users can take part in discussions, ask and answer questions, and participate in hackathons.

Figure 3: IUDX sandbox architecture

The philosophy of IUDX is “Unleashing the power of data for public good”. The IUDX data sandbox is an effective way to unleash this power: it enables the provider to open data to a wider community and, at the same time, gives users the capability to explore datasets and create their own notebooks and applications.