• A grid enabled staging DBMS method for data Mapping, Matching & Loading

      Ahmed, Ejaz (University of Bedfordshire, 2011)
      This thesis is concerned with the need to deal with data anomalies, inconsistencies and redundancies within the context of data integration in grids. A data Mapping, Matching and Loading (MML) process that is based on the Grid Staging Catalogue Service (MML-GSCATS) method is identified. In particular, the MML-GSCATS method consists of the development of two mathematical algorithms for the MML processes. Specifically, it defines an intermediate data storage staging facility in order to process, upload and integrate data from repositories ranging from small to large. With this in mind, it expands the integration notion of a database management system (DBMS) to include the MML-GSCATS method in traditional distributed and grid environments. The data mapping employed takes the form of value correspondences between source and target databases, whilst data matching consolidates the distinct catalogue schemas of federated databases so that information can be accessed seamlessly. Because anomalies and inconsistencies in the grid must be dealt with, the MML processes are applied to a healthcare case study with developed scenarios. These scenarios were used to test the MML-GSCATS method with the help of a software prototyping toolkit. Testing established benchmarks for performance, reliability and error detection (anomalies and redundancies). Cross-scenario data sets were formulated, and the results of the scenarios were compared against these benchmarks. The benchmarks help in comparing the MML-GSCATS methodology with traditional and current grid methods. Results from the testing and experiments demonstrate that MML-GSCATS is a valid method for identifying data anomalies, inconsistencies and redundancies that are produced during loading. The testing results indicate that MML-GSCATS performs better than traditional methods.