After a long break, I would like to present the second part of our GSoC 2013 journey for the project MOST. After tackling the design issue, the second major part was to build the Java framework which handles the data migration and provides the interface to access the newly created Cassandra database. I have summarized the whole task in the following points. I hope the information below gives you a basic idea of the framework. Feel free to ask any questions about it.
After implementing the proposed Cassandra database design, we needed to migrate the huge amount of sensor data to the newly created Cassandra database. For this task the framework provides the Data Migration class. Its methods can migrate the data of all sensors, or you can pass a sensor name as a parameter to migrate the data of that specific sensor.
The migration class has access to both the MySQL and the Cassandra database. It reads data from the MySQL database and sends it to Cassandra. The Cassandra interface first validates the data and then writes it to the respective column family of the sensor. Certain conditions need to be fulfilled before data is written to the database, and the validation module handles all of them. It is described further in this article.
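To make the flow above concrete, here is a minimal sketch of the read-then-write loop. It is not the framework's actual code: the names `Measurement`, `MeasurementSource`, `MeasurementSink` and `DataMigrationSketch` are hypothetical, the source interface stands in for the MySQL side, and the sink stands in for the Cassandra interface, which is assumed to return `false` when validation rejects a row.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical value type: one reading with an epoch-millisecond timestamp. */
final class Measurement {
    final long timestampMillis;
    final double value;

    Measurement(long timestampMillis, double value) {
        this.timestampMillis = timestampMillis;
        this.value = value;
    }
}

/** Hypothetical stand-in for the MySQL side of the migration. */
interface MeasurementSource {
    List<String> sensorNames();
    List<Measurement> readAll(String sensorName);
}

/** Hypothetical stand-in for the Cassandra interface; false means "rejected". */
interface MeasurementSink {
    boolean write(String sensorName, Measurement measurement);
}

final class DataMigrationSketch {
    private final MeasurementSource source;
    private final MeasurementSink sink;

    DataMigrationSketch(MeasurementSource source, MeasurementSink sink) {
        this.source = source;
        this.sink = sink;
    }

    /** Migrates one sensor and returns how many rows the sink accepted. */
    int migrateSensor(String sensorName) {
        int written = 0;
        for (Measurement m : source.readAll(sensorName)) {
            if (sink.write(sensorName, m)) {
                written++;
            }
        }
        return written;
    }

    /** Migrates every sensor known to the source; maps sensor name to accepted rows. */
    Map<String, Integer> migrateAll() {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String name : source.sensorNames()) {
            counts.put(name, migrateSensor(name));
        }
        return counts;
    }
}
```

Keeping both databases behind small interfaces is only one possible design; it makes the loop easy to test without a running MySQL or Cassandra instance.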
The Cassandra Interface
The Cassandra interface handles the read, write and delete operations for the Cassandra database. The class supports operations such as reading the latest value, reading values in a given time range, deleting values in a given time range, deleting all data values of a specified sensor, and writing measurements to the column families. It also provides a method to get periodic values for a specified time range. These values are generated by the periodic data generator class, which is also part of the framework and is described further in this article.
The purpose of storing these sensor measurements is to carry out further experiments on the collected data. The stored data must therefore be useful for those experiments and must not lead to wrong outputs. There were certain conditions: for example, the value of a particular sensor must lie in a given range, and the interval between two values has to be greater than or equal to the predefined sample interval. The persist validator class provides methods to check these conditions.
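The operations listed above can be modelled with a small in-memory class. This is only an illustration of the operation set, not the framework's Cassandra code: `InMemorySensorStore` and its method names are hypothetical, and a sorted map per sensor stands in for a column family whose columns are ordered by timestamp.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical in-memory model: one time-ordered map per sensor. */
final class InMemorySensorStore {
    private final Map<String, NavigableMap<Long, Double>> rowsBySensor = new HashMap<>();

    /** Writes (or overwrites) the value stored at the given timestamp. */
    void write(String sensor, long timestampMillis, double value) {
        rowsBySensor.computeIfAbsent(sensor, k -> new TreeMap<>()).put(timestampMillis, value);
    }

    /** Returns the newest entry, or null if the sensor has no data. */
    Map.Entry<Long, Double> readLatest(String sensor) {
        NavigableMap<Long, Double> rows = rowsBySensor.get(sensor);
        return rows == null ? null : rows.lastEntry();
    }

    /** Returns a copy of all entries with from <= timestamp <= to. */
    NavigableMap<Long, Double> readRange(String sensor, long from, long to) {
        NavigableMap<Long, Double> rows = rowsBySensor.get(sensor);
        if (rows == null || from > to) {
            return new TreeMap<>();
        }
        return new TreeMap<>(rows.subMap(from, true, to, true));
    }

    /** Deletes all entries with from <= timestamp <= to; returns how many were removed. */
    int deleteRange(String sensor, long from, long to) {
        NavigableMap<Long, Double> rows = rowsBySensor.get(sensor);
        if (rows == null || from > to) {
            return 0;
        }
        NavigableMap<Long, Double> view = rows.subMap(from, true, to, true);
        int removed = view.size();
        view.clear(); // clearing the view removes the entries from the backing map
        return removed;
    }

    /** Deletes every value stored for the sensor. */
    void deleteAll(String sensor) {
        rowsBySensor.remove(sensor);
    }
}
```

A sorted map is used here because it makes "latest value" and "time range" lookups direct, which mirrors why time-ordered columns suit this kind of sensor data.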
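The two conditions mentioned above could be checked roughly as follows. This is a sketch under assumptions, not the actual persist validator: the class name `PersistValidatorSketch`, its constructor parameters and the choice of one validator instance per sensor are all hypothetical.

```java
/** Hypothetical per-sensor validator for the range and sample-interval conditions. */
final class PersistValidatorSketch {
    private final double minValue;
    private final double maxValue;
    private final long sampleIntervalMillis;
    private Long lastAcceptedMillis = null; // null until the first value is accepted

    PersistValidatorSketch(double minValue, double maxValue, long sampleIntervalMillis) {
        this.minValue = minValue;
        this.maxValue = maxValue;
        this.sampleIntervalMillis = sampleIntervalMillis;
    }

    /** True if the value lies inside [minValue, maxValue]; NaN is always rejected. */
    boolean isValueInRange(double value) {
        return value >= minValue && value <= maxValue;
    }

    /** True if at least one sample interval has passed since the last accepted value. */
    boolean isIntervalOk(long timestampMillis) {
        return lastAcceptedMillis == null
                || timestampMillis - lastAcceptedMillis >= sampleIntervalMillis;
    }

    /** Checks both conditions and remembers the timestamp when the value is accepted. */
    boolean accept(long timestampMillis, double value) {
        if (!isValueInRange(value) || !isIntervalOk(timestampMillis)) {
            return false;
        }
        lastAcceptedMillis = timestampMillis;
        return true;
    }
}
```

For example, a validator created with `new PersistValidatorSketch(-40.0, 60.0, 1000L)` would accept a reading of 21.5 at time 0, reject another reading 500 ms later because the interval is too short, and reject a reading of 75.0 at any time because it is out of range.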
Periodic Data Generation
Sensor data may be inaccurate, or an experiment may require periodic data. In our project we also needed periodic data. To fulfill this requirement, the framework provides the periodic data generator class, which supports methods to generate periodic data. Algorithms which use concepts like weighted average linear interpolation are implemented in this class. Further details can be found in the research papers related to this project here and here.
So this is all about our GSoC 2013 journey for the project MOST. I am looking forward to contributing further to this project. :) :)
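As a rough idea of what generating periodic data involves, the sketch below resamples irregular readings onto a fixed time grid using plain linear interpolation, where each grid value is a weighted average of its two neighbouring readings. It is a simplified stand-in and not the algorithm from the project's papers; the class name `PeriodicInterpolationSketch` and the `resample` signature are hypothetical.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical helper: resamples irregular readings onto a fixed-period grid. */
final class PeriodicInterpolationSketch {

    /**
     * Returns one value per grid point start, start + period, ... up to end (inclusive).
     * Grid points without a reading on both sides are skipped (no extrapolation).
     */
    static NavigableMap<Long, Double> resample(
            NavigableMap<Long, Double> raw, long start, long end, long periodMillis) {
        if (periodMillis <= 0) {
            throw new IllegalArgumentException("periodMillis must be positive");
        }
        NavigableMap<Long, Double> periodic = new TreeMap<>();
        for (long t = start; t <= end; t += periodMillis) {
            Map.Entry<Long, Double> before = raw.floorEntry(t);   // latest reading at or before t
            Map.Entry<Long, Double> after = raw.ceilingEntry(t);  // earliest reading at or after t
            if (before == null || after == null) {
                continue;
            }
            if (before.getKey().equals(after.getKey())) {
                periodic.put(t, before.getValue()); // a reading exists exactly at t
                continue;
            }
            // Weight of the later reading grows linearly as t moves towards it.
            double w = (double) (t - before.getKey()) / (after.getKey() - before.getKey());
            periodic.put(t, (1.0 - w) * before.getValue() + w * after.getValue());
        }
        return periodic;
    }
}
```

With readings of 10.0 at 0 ms and 20.0 at 1000 ms, the grid point at 500 ms lies halfway between them and therefore receives the value 15.0.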