SAMODATACUBEAI
Integrated Strategical and Operational Advanced Analytical tools to manage and control the water-energy-agriculture circular projects
B.4 Environmental protection, climate change adaptation and mitigation
B.4.2: Reduce municipal waste generation, promote source-separated collection and its optimal exploitation, in particular its organic component
Wastes Issues
-Climatological/environmental/Waste water issues

This work package analyses the main climatological and environmental issues affecting water resources in India at the urban-rural scale, as a starting point for understanding strategic approaches to solve them. In India, water resources are increasingly unevenly distributed, driven by climate change, extreme water-related events (floods and droughts) and rising demand from population growth and economic development. Any water-related engineering design should take this context into account.

In this work package, our goal is to extend the existing SAMO platform into a big data processing platform that combines satellite remote sensing data, observational data and data from different types of sensors into a unified data cube at technology readiness level (TRL) 6. The data cube is the central interface between data and analytical methods in the desired big data processing platform; i.e., it is the input to the deep learning and data analytics components.

The first objective is to develop an integrated, data-driven management and analysis approach, based on a Balanced Scorecard (BSC), that combines various water data sources. The second objective is to build the data cube. The third objective is to develop novel visualization methods that help users assess the quality of the water data sources, e.g., to detect implausible data values, missing values, etc. We also plan to develop methods that enable users to explore summary statistics of the data cube, e.g., to assess the quality of the combined data. The fourth objective is to develop interfaces and modules that ensure interoperability with the data analytics, deep learning and data storage components and with various data sources. The fifth objective is to build methods that automatically annotate and describe the raw data and the data cubes with meta-information.

The work package addresses the important challenge of how to combine various heterogeneous data sources.
We address this challenge by a) developing and implementing state-of-the-art algorithms for combining various data sources and b) developing and implementing a data cube. The project also addresses a second important challenge of big data: how to assess the quality of the input data and of the combined data, and its impact on the final analytical result. We address this challenge by developing methods that help users assess the quality of the water data sources and the combined data. We also plan to develop methods that enable users to explore summary statistics of the combined data. We complement these methods with interactive visualization so that users can understand the data rapidly. The third challenge is to ensure interoperability of the Strategical Advanced Modelling Online (SAMO) platform with the other crucial components of the desired big data processing platform. We address this challenge by specifying common interfaces and modules with all partners in the consortium. Finally, the project addresses a further important challenge for big data platforms: the proper description of data with meta-information. We address this challenge with methods that automatically annotate raw data and cubes with suitable meta-data. We follow state-of-the-art meta-data concepts from big data communities and implement them in the desired big data processing platform.
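To make the idea of combining sources, checking quality and attaching meta-information more concrete, the following is a minimal sketch only, not the AI-SAMO implementation. The class `DataCube`, its methods, the variable names and the plausible ranges are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# (ISO date, grid-cell id); both identifiers are illustrative assumptions.
Key = Tuple[str, str]


@dataclass
class DataCube:
    """Toy data cube: at most one value per (date, cell, variable)."""
    values: Dict[Key, Dict[str, Optional[float]]] = field(default_factory=dict)
    metadata: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def add_source(self, variable: str,
                   records: Dict[Key, Optional[float]], **meta) -> None:
        """Merge one data source into the cube and store its meta-information."""
        for key, value in records.items():
            self.values.setdefault(key, {})[variable] = value
        self.metadata[variable] = {k: str(v) for k, v in meta.items()}

    def quality_report(self, plausible: Dict[str, Tuple[float, float]]):
        """Count missing and implausible (out-of-range) values per variable."""
        report = {}
        for variable, (low, high) in plausible.items():
            missing = implausible = 0
            for cell_values in self.values.values():
                value = cell_values.get(variable)
                if value is None:
                    missing += 1
                elif not low <= value <= high:
                    implausible += 1
            report[variable] = {"missing": missing, "implausible": implausible}
        return report


cube = DataCube()
cube.add_source(
    "soil_moisture",
    {("2020-01-01", "c1"): 0.31, ("2020-01-01", "c2"): 1.7},
    source="satellite", unit="m3/m3",
)
cube.add_source(
    "water_level_m",
    {("2020-01-01", "c1"): 2.4},
    source="gauge", unit="m",
)
# The plausible ranges are assumptions made for this example only.
report = cube.quality_report(
    {"soil_moisture": (0.0, 1.0), "water_level_m": (0.0, 50.0)}
)
print(report)
```

In this sketch the out-of-range soil moisture value in cell `c2` is counted as implausible, and the absent gauge reading for the same cell is counted as missing. A real platform would work on gridded arrays rather than dictionaries, but the summary logic would be similar.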
Appendix
I see our role as building the core big data processing engine. We start by extending the AI-SAMO software to combine satellite image data with additional data sources. Furthermore, we define internal interfaces to deep learning and data analytics with the corresponding partners in the project. The deep learning and data analytics components do not need to handle the processing and storage of data; they call specific modules or operators of the big data processing engine. The most important partners for us are therefore those responsible for deep learning and data analytics. On top of these components, users can build their applications. To enable this, we need to define a public "application interface" that allows applications to use the platform.

To sum up, we have two abstractions. The first is the internal interface between the data and the learning and analytics methods. I suggest using a data cube as this interface. This abstraction hides the actual data processing, the management of compute jobs, etc. from the analysis algorithms.

The second abstraction is the public interface. Here we hide internal implementation details, such as the data cube, from users. Users just call analysis methods and select data sources. I suggest providing a visual "web app", like the one we have in SAMO, to support users in this task.
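The two abstractions can be sketched as follows. This is a minimal illustration under assumptions: `CubeReader`, `InMemoryCube`, `PublicApi` and `mean_value` are hypothetical names, not existing SAMO interfaces.

```python
from typing import Callable, Dict, List, Protocol


class CubeReader(Protocol):
    """Internal interface: analysis methods only see this view of the data."""
    def read(self, variable: str) -> List[float]: ...


class InMemoryCube:
    """Stand-in for the data cube; storage and compute jobs stay hidden here."""
    def __init__(self, data: Dict[str, List[float]]) -> None:
        self._data = data

    def read(self, variable: str) -> List[float]:
        return list(self._data[variable])


def mean_value(cube: CubeReader, variable: str) -> float:
    """Example analysis method written purely against the internal interface."""
    values = cube.read(variable)
    return sum(values) / len(values)


class PublicApi:
    """Public interface: users pick a method and a data source, nothing else."""
    def __init__(self, cube: CubeReader,
                 methods: Dict[str, Callable[[CubeReader, str], float]]) -> None:
        self._cube = cube
        self._methods = methods

    def run(self, method: str, source: str) -> float:
        if method not in self._methods:
            raise ValueError(f"unknown analysis method: {method}")
        return self._methods[method](self._cube, source)


api = PublicApi(InMemoryCube({"rainfall_mm": [2.0, 4.0, 6.0]}),
                {"mean": mean_value})
result = api.run("mean", "rainfall_mm")
print(result)
```

The analysis function never touches storage, and the caller of `PublicApi.run` never touches the cube, which is the separation described above.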

The key benefits of using a platform include:

1. Better Strategic District Waste Water Planning and Monitoring of Water-Related Engineering Projects
The Balanced Scorecard in combination with AI-SAMO technology provides a powerful framework for building and communicating strategy. The model helps managers think about cause-and-effect relationships between the different strategic objectives. The process of creating a strategy ensures that consensus is reached on a set of interrelated strategic objectives. This means that performance outcomes, as well as key enablers or drivers of future performance, are identified to create a complete picture of the strategy.

2. Improved Waste Water District Strategy Communication & Execution
Having a big picture of the strategy allows companies to communicate it easily, both internally and externally. We have known for a long time that a picture is worth a thousand words. This 'plan' facilitates the understanding of the strategy and helps to engage staff and external stakeholders in its delivery and review. The thing to remember is that it is difficult for people to help execute a strategy they do not fully understand.

3. Better Alignment of Waste Water District Projects and Initiatives
The Balanced Scorecard helps organisations map their projects and initiatives to the different strategic objectives, which in turn ensures that the projects and initiatives are tightly focused on delivering the most important strategic objectives.

4. Better Waste Water District Information Sharing and Management
The Balanced Scorecard approach helps organisations design key performance indicators for their various strategic objectives. This ensures that companies are measuring what actually matters.

5. Improved Waste Water District Performance & Reporting
The approach can be used to guide the design of performance reports and dashboards. This ensures that the management reporting focuses on the most important strategic issues and helps companies monitor the execution of their plan.

6. Better Organisational Alignment
The approach enables companies to better align their organisational structure with the strategic objectives. In order to execute a plan well, organisations need to ensure that all units and support functions are working towards the same goals. Cascading the approach into those units helps to achieve that and links strategy to operations.

7. Better Process Alignment
When well implemented, the approach also helps to align organisational processes such as budgeting, risk management and analytics with the strategic priorities.



Objectives of the work package
This work package will focus on building an integrated, balanced AI Data Cube for the implementation of Strategical Advanced Modelling Online (SAMO), in order to manage water resources and support the activities of the other work packages.
The final product will be an AI-based data visualisation and analysis system, integrated into water management and operations processes, that enables real-time capture and delivery of information and monitoring of the activities of the circular water cycle.
Data needed for the project
List of currently available data:


1. Monitor with Copernicus data (Sentinel-1 radar, possibly also Sentinel-3). We need to check that there is regular coverage over India.
2. Use a processing workflow to do the following:
a. Filter out areas where soil moisture and standing water extent cannot be derived (based on land cover).
b. For areas where it is possible, apply algorithms to Sentinel-1 data to estimate soil moisture. At worst this could be a relative result (increase/decrease), but if modelling and/or in situ observations are available for calibration, it could be quantitative.
c. Sentinel-3 could be used for water level (lakes, large rivers, etc.), but this may not be necessary if water levels are monitored effectively with a gauge network.
3. Satellite remote sensing images of the EU from the Landsat-5, Landsat-7, Landsat-8 and Sentinel-2 missions are available.
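Steps 2a and 2b of the workflow above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's algorithm: the excluded land-cover classes, the per-pixel dry and wet reference backscatter values, and the linear scaling between them (a common change-detection style index) are all assumptions introduced for this example.

```python
from typing import Dict, List, Optional

# Assumption: land-cover classes where retrieval is treated as not possible.
EXCLUDED_LAND_COVER = {"urban", "dense_forest", "water"}


def relative_soil_moisture(backscatter_db: float,
                           dry_db: float, wet_db: float) -> float:
    """Scale backscatter between dry and wet references to a 0..1 index.

    The result is relative (drier/wetter), not a calibrated quantity.
    """
    if wet_db <= dry_db:
        raise ValueError("wet reference must exceed dry reference")
    index = (backscatter_db - dry_db) / (wet_db - dry_db)
    return min(1.0, max(0.0, index))


def process(pixels: List[Dict]) -> Dict[str, Optional[float]]:
    """Step 2a: mask by land cover. Step 2b: compute the relative index."""
    results: Dict[str, Optional[float]] = {}
    for pixel in pixels:
        if pixel["land_cover"] in EXCLUDED_LAND_COVER:
            results[pixel["id"]] = None  # masked out, no estimate
        else:
            results[pixel["id"]] = relative_soil_moisture(
                pixel["backscatter_db"], pixel["dry_db"], pixel["wet_db"]
            )
    return results


# Hypothetical pixel records; the values are invented for the example.
pixels = [
    {"id": "p1", "land_cover": "cropland",
     "backscatter_db": -12.0, "dry_db": -16.0, "wet_db": -8.0},
    {"id": "p2", "land_cover": "urban",
     "backscatter_db": -5.0, "dry_db": -16.0, "wet_db": -8.0},
    {"id": "p3", "land_cover": "grassland",
     "backscatter_db": -6.0, "dry_db": -16.0, "wet_db": -8.0},
]
results = process(pixels)
print(results)
```

If in situ observations or model output were available, the relative index could be calibrated against them to obtain a quantitative estimate, as noted in step 2b.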

List of other necessary data:
1. Users can download additional satellite images using the SAMO platform
2. Data from sensor networks

Description of work: BUILD DATACUBE FOR AI-SAMO PLATFORM
We plan to extend the existing SAMO platform into a big data processing platform that combines satellite remote sensing data, observational data from unmanned aerial vehicles and data from different types of sensors. We follow a coordinated four-phase research and development model to a) extend the SAMO software and b) build the desired big data processing platform. In the first phase, we specify the interfaces of a) methods that combine various data sources, b) the data cube and the interfaces to other components, c) methods to assess the quality of raw data and the combined data, and d) methods to describe raw data and combined data. In the second phase, the main development phase, we focus on identifying, further developing and combining suitable methodological concepts to address the requirements of these methods. In the third phase, we focus on implementing the methodological concepts and on their initial tests. Finally, in the fourth phase, we systematically validate our methods by applying them to real-world data sets.



Phase 1. Specification of interfaces
The first step in the specification of the interfaces is to describe precisely the requirements of each method. We specify a) the inputs to each method, b) how a particular method processes the data, and c) the outputs of each method. This description is important because it ensures that our methods interoperate with AI-SAMO and with the other crucial components of the desired platform. In particular, the result of the discussions in b) is a detailed description of the methodological requirements of each method, which serves as a guideline in the second phase, our main development phase. Our second step is to specify the technical requirements of integrating new methods with AI-SAMO and of integrating SAMO with the desired platform. The results are a precise description of the technical requirements of the methods and a list of modules that must be implemented to ensure interoperability with SAMO. The precise description includes a) which modules each method uses, b) the inputs to the modules, c) the processing and analysis steps performed by AI-SAMO, and d) the outputs of the modules. Finally, we specify the test data. This list serves as the gold standard of the work package and enables members of the consortium to test their methodological concepts and code.
Required inputs: none
Deliverables:
1. Document with the detailed description of the interfaces and conceptual requirements of the data science methods
2. Document with the detailed description of the technical requirements
3. List of test procedures and quality standards
Estimated timeframe: 2 months
Estimated manpower: 2 full-time software developers
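As an illustration of what a Phase 1 interface specification might capture, the sketch below records the inputs, processing and outputs of each method and checks that a chain of methods is mutually compatible. The names `MethodSpec` and `missing_inputs`, and the example method and data names, are hypothetical and not part of the deliverables.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List, Set


@dataclass(frozen=True)
class MethodSpec:
    """Interface description of one method: inputs, processing, outputs."""
    name: str
    inputs: FrozenSet[str]
    outputs: FrozenSet[str]
    processing: str  # free-text description of what the method does


def missing_inputs(pipeline: Iterable[MethodSpec],
                   available: Set[str]) -> Dict[str, List[str]]:
    """Return, per method, the inputs that no earlier step or source provides."""
    have = set(available)
    problems: Dict[str, List[str]] = {}
    for spec in pipeline:
        lacking = spec.inputs - have
        if lacking:
            problems[spec.name] = sorted(lacking)
        have |= spec.outputs
    return problems


combine = MethodSpec(
    "combine",
    frozenset({"satellite_images", "sensor_readings"}),
    frozenset({"data_cube"}),
    "merge heterogeneous sources into one cube",
)
assess = MethodSpec(
    "assess_quality",
    frozenset({"data_cube"}),
    frozenset({"quality_report"}),
    "flag missing and implausible values",
)
annotate = MethodSpec(
    "annotate",
    frozenset({"data_cube", "metadata_schema"}),
    frozenset({"annotated_cube"}),
    "attach meta-information to the cube",
)

problems = missing_inputs([combine, assess, annotate],
                          {"satellite_images", "sensor_readings"})
print(problems)
```

Here the check reports that the annotation step needs a `metadata_schema` that nothing provides, which is the kind of gap the specification phase is meant to surface before development starts.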

Phase 2. Methodological Development of methods
The second phase develops suitable methodological concepts for a) methods that combine various data sources, b) the data cube and the interfaces to other components, c) methods to assess the quality of raw data and the combined data, and d) methods to describe raw data and combined data.
Required inputs:
1. Specifications of interfaces and conceptual requirements from phase 1
Deliverables:
1. Document with a detailed algorithmic description of a) methods that combine various data sources, b) the data cube and the interfaces to other components, c) methods to assess the quality of raw data and the combined data, and d) methods to describe raw data and combined data
2. Document with worst-case scenarios and the expected behavior of each method under these scenarios. This is important to ensure TRL 6.

Estimated timeframe: 2 months
Estimated manpower: 2 full-time software developers

Phase 3. Implementation of methods
We implement the methods in the third phase. To finish this phase, we integrate the modules with AI-SAMO and tightly couple AI-SAMO with the other crucial components to build the desired big data processing platform. We integrate the methods by calling specific modules of the big data processing platform. This step also includes the extension of AI-SAMO to ensure interoperability with the new methods and other crucial components.

Required inputs:
1. Methodological description of the methods from phase 2
Deliverables:
1. Implementations of the methods
2. Extensions of AI-SAMO with new methods and modules to ensure interoperability
3. Coupling of the crucial components (data analytics, deep learning, data sources, AI-SAMO and cloud environment) to build the desired big data processing platform
Estimated timeframe: 2 months
Estimated manpower: 2 full-time software developers

Phase 4. Validation and operational system
The first step is to test the methods using our test procedure and test data. We analyze the outcome of each test procedure in detail to check for defects or other problems. Our focus is to validate the stability of the big data processing platform by varying the input data. We also validate the correctness of the results of each computational run.
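Validating stability by varying the input data could look roughly like the sketch below. It is an assumption-laden illustration: `robust_mean` stands in for any platform method, and the variation strategy (randomly dropping values) and the acceptance check (result within the range of the remaining values) are choices made for this example only.

```python
import random
from typing import Callable, List, Optional


def robust_mean(values: List[Optional[float]]) -> Optional[float]:
    """Example method under test: mean that ignores missing values."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


def validate_stability(method: Callable[[List[Optional[float]]], Optional[float]],
                       baseline: List[float],
                       runs: int = 20,
                       drop_rate: float = 0.3,
                       seed: int = 0) -> int:
    """Run the method on randomly degraded inputs and count failed runs."""
    rng = random.Random(seed)  # fixed seed keeps the test reproducible
    failures = 0
    for _ in range(runs):
        varied = [None if rng.random() < drop_rate else v for v in baseline]
        try:
            result = method(varied)
        except Exception:
            failures += 1  # a crash on degraded input is a stability defect
            continue
        present = [v for v in varied if v is not None]
        # Correctness check: a mean must lie within the remaining values.
        if present and (result is None
                        or not min(present) <= result <= max(present)):
            failures += 1
    return failures


failures = validate_stability(robust_mean, [1.0, 2.0, 3.0, 4.0])
print(failures)
```

A method that does not handle missing values, such as a plain `sum(values) / len(values)`, would raise an error on the degraded inputs and be counted as failing, which is the kind of defect this phase is meant to catch.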

Required inputs:
Implemented methods from phase 3
Deliverables:
Big data processing platform at TRL 6
Estimated timeframe: months
Estimated manpower: 2 full-time software developers
Project Deliverables & expected outcomes:
Dxxx.1 Specification of interfaces and conceptual requirements of the methods and the desired big data processing platform
Dxxx.2 Methodological concepts for a) methods that combine various data sources, b) the data cube and the interfaces to other components, c) methods to assess the quality of raw data and the combined data, and d) methods to describe raw data and combined data
Dxxx.3 Implemented and tested methods for a) combining various data sources, b) the data cube, c) assessing the quality of raw data and the combined data, and d) describing raw data and combined data
Dxxx.4 Big data processing platform at TRL 6



Oxford Sustainable Development Enterprise (OXFORD-SDE) Italy (formerly the Economic Geography Research Group of the University of Oxford) is an innovative European Economic Interest Group (E.E.I.G.) registered in Italy (2001), with headquarters in Oxford and London, in partnership with universities, Oxford University Innovation and Imperial College, and with the support of IBM (UK).
 
The OXFORD-SDE group has a wide range of highly qualified water experts and acts as a catalyst and clearing house for cooperative efforts among members and associated members. Over the last 4 years, OxSDE has developed, through a research programme, the SAMO DATA CUBE AI platform to create innovative solutions for sustainable water development.
OxSDE Italy is interested in partnerships, either offering our SAMO DataCube AI platform as a service or participating as a partner. The platform can also be adapted to any other call of the ENI CBC programme.
Contact: Prof. Stefano Bonfa, s.bonfa@yahoo.co.uk
Rome, Italy
Tel. +442085242466, mob. 00447529437976