Exploring Data Mining for Hydrological Modelling

Trending PhD Thesis on Exploring Data Mining for Hydrological Modelling

Research Area:  Data Mining

Abstract:

   Technological advances in computer science, namely cloud computing and data mining, are reshaping the way the world looks at data. Data are becoming the drivers of discoveries and strategic developments. In environmental sciences, for instance, big volumes of information are produced by monitoring networks, satellites and model simulations and are processed to uncover hidden patterns, correlations and trends to, ultimately, support policy and decision making. Hydrologists, in particular, use models to simulate river discharges and estimate the concentration of pollutants as well as the risk of floods and droughts. The very first step of any hydrological modelling exercise consists of selecting an appropriate model. However, the choice is often made by the modeller based on his/her expertise rather than on the model's suitability to reproduce the most important processes for the area under study.
   Since this approach defeats the "scientific method" for its lack of reproducibility and consistency across experts as well as locations, a shift towards a data-driven selection process is deemed necessary. This work presents the design, development and testing results of a completely novel data mining algorithm, called AMCA, able to automatically identify the most suitable model configurations for a given catchment, using minimum data requirements and an inventory of model structures. In the design phase a transdisciplinary approach was adopted, borrowing techniques from the fields of machine learning, signal processing and marketing. The algorithm was tested on the Severn at Plynlimon flume catchment, in the Plynlimon study area (Wales, UK). This area was selected because of its reliable measurements and the homogeneity of its soils and vegetation. The Framework for Understanding Structural Errors (FUSE) was used as a sample model inventory, but the methodology can easily be adapted to others, including more sophisticated model structures. The model configuration problem that the AMCA attempts to solve can be categorised as "fully unsupervised" if there is no prior knowledge of interactions and relationships amongst observed data at a certain location and available model structures and parameters.
   Therefore, the first set of tests was run on a synthetic dataset to evaluate the algorithm's performance against known outcomes. Most of the components of the synthetic model structure were clearly identified by the AMCA, which allowed further testing to proceed using observed data. Using real observations, the AMCA efficiently selected the most suitable model structures and, when coupled with association rule mining techniques, could also identify optimal parameter ranges. The performance of the ensemble suggested by the combination of AMCA and association rules was calibrated and validated against four widely used models (TOPMODEL, ARNO/VIC, PRMS and Sacramento). The ensemble configuration always returned the best average efficiency, characterised by the narrowest spread and, therefore, lowest uncertainty. As a final application, the full set of FUSE models was used to predict the effect of land use changes on catchment flows.
   The predictive uncertainty improved significantly when the prior distributions of model structures and parameters were conditioned using the AMCA approach. It was also noticed that such improvement is due to constraints applied to both the model and the parameter space; however, the parameter space seems to contribute more. These results confirm that a considerable part of the uncertainty in prediction is due to the definition of the prior choice of the model configuration and that more objective ways to constrain the prior using formal data-driven techniques are needed. AMCA is, however, a procedure that can only be applied to gauged catchments. Future experiments could test whether AMCA configurations could be regionalised or transferred to ungauged catchments on the basis of catchment characteristics.
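
   The abstract mentions that association rule mining was used to identify optimal parameter ranges but does not describe the procedure. The following is only a minimal sketch of how standard association rule metrics (support, confidence, lift) could link a parameter range to well-performing simulations. It is not the AMCA implementation: the parameter name `k`, the three equal-width bins, the 0.7 efficiency threshold, the "behavioural" label and the toy data are all assumptions made for illustration.

```python
def to_items(param_value, efficiency, threshold=0.7):
    """Turn one simulation into a set of categorical items.

    The parameter is assumed to be scaled to [0, 1] and is split into
    three hypothetical bins; a run is tagged "behavioural" when its
    efficiency score reaches the (assumed) threshold.
    """
    if param_value < 1 / 3:
        param_bin = "k=low"
    elif param_value < 2 / 3:
        param_bin = "k=mid"
    else:
        param_bin = "k=high"
    items = {param_bin}
    if efficiency >= threshold:
        items.add("behavioural")
    return items


def rule_metrics(runs, antecedent, consequent="behavioural"):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(runs)
    n_a = sum(1 for items in runs if antecedent in items)
    n_c = sum(1 for items in runs if consequent in items)
    n_ac = sum(1 for items in runs if antecedent in items and consequent in items)
    support = n_ac / n                             # share of runs with both items
    confidence = n_ac / n_a if n_a else 0.0        # P(consequent | antecedent)
    lift = confidence / (n_c / n) if n_c else 0.0  # > 1 means positive association
    return support, confidence, lift


# Toy data (invented): (scaled parameter value, efficiency score) per simulation.
raw_runs = [
    (0.10, 0.82), (0.20, 0.75), (0.25, 0.64), (0.50, 0.55),
    (0.60, 0.71), (0.80, 0.40), (0.90, 0.35), (0.15, 0.90),
]
runs = [to_items(value, score) for value, score in raw_runs]

for bin_label in ("k=low", "k=mid", "k=high"):
    support, confidence, lift = rule_metrics(runs, bin_label)
    print(f"{bin_label} -> behavioural: "
          f"support={support:.3f} confidence={confidence:.2f} lift={lift:.2f}")
```

   On this toy data the rule "k=low -> behavioural" has a confidence of 0.75 and a lift of 1.5, so the low range would be the candidate for a constrained parameter prior. A real application would mine rules over many parameters and model structure options at once, typically with a dedicated library rather than hand-written counts.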

Name of the Researcher:  Vitolo, Claudia

Name of the Supervisor(s):  Buytaert, Wouter; Onof, Christian

Year of Completion:  2016

University:  Imperial College London

Thesis Link:   Home Page Url