Challenges and Techniques for Testing of Big Data

Abstract Big Data, the new buzz word in the industry, is data that exceeds the processing and analytic capacity of conventional database systems within the time necessary to make them useful. With multiple data stores in abundant formats, billions of rows of data with hundreds of millions of data combinations and the urgent need of making best possible decisions, the challenge is big and the solution bigger, Big Data. Comes with it, new advances in computing technology together with its high performance analytics for simpler and faster processing of only relevant data to enable timely and accurate insights using data mining and predictive analytics, text mining, forecasting and optimization on complex data to continuously drive innovation and make the best possible decisions. While Big Data provides solutions to complex business problems like analyzing larger volumes of data than was previously possible to drive more precise answers, analyzing data in motion to capture opportunities that were previously lost, it poses bigger challenges in testing these scenarios. Testing such highly volatile data, which is more often than not unstructured generated from myriad sources such as web logs, radio frequency Id (RFID), sensors embedded in devices, GPS systems etc. and mostly clustered data for its accuracy, high availability, security requires specialization. One of the most challenging things for a tester is to keep pace with changing dynamics of the industry. While on most aspects of testing, the tester need not know the technical details behind the scene however this is where testing Big Data Technology is so different. A tester not only needs to be strong on testing fundamentals but also has to be equally aware of minute details in the architecture of the database designs to analyze several performance bottlenecks and other issues. Like in the example quoted above on In-Memory databases, a tester would need to know how the operating systems allocate and de-allocate memory and understand how much memory is being used at any given time. So, concluding, as the data- analytics Industry evolves further we would see the IT Testing Services getting closely aligned with the Database Engineering and the industry would need more skilled testing professional in this domain to grab the new opportunities.