Ranganath HR

Speaker

Ranganath HR

Specialties
Performance Engineering (Testing and Tuning), HP LoadRunner, Performance Center, HP Diagnostics, Quality Center, TeamQuest, SiteScope, Wireshark, HttpWatch, PerfMon, rstat, nmon, SAP Performance Testing and Engineering, Dynatrace

Title: Follow a Tweet - BigData Pipeline Testing

Abstract:

In today's digital world, social media generates terabytes of data every second. Many companies consume this data and turn it into business opportunities. Big Data technologies play a vital role in handling such large volumes; they deal with the volume, velocity and variety of data. The data can take any form: messages, logs, XML, sensor data, photos, images and so on. Processing it with a Big Data tech stack and running analytics on it can be very insightful. The number of Google searches, Facebook messages and likes, and Twitter tweets generated every day is phenomenal. Businesses use this information in numerous ways, managing and analyzing it to gain a competitive edge.

There are different types of data pipelines, such as batch processing and stream processing, and each requires a different testing strategy. This talk considers a near-real-time (stream processing) pipeline.

Use case:

Consider a new sporting goods company that wants to target its sales of sports equipment based on the pulse of each sport in each country.

The Olympics is currently the top trending topic. We will extract all the related tweets from Twitter and load them into our ecosystem. Using this data, we will build analytics to find the top sport per country, target the corresponding sports equipment by demographic, and present a visualization.
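The "top sport per country" analytic described above could be sketched roughly as follows. This is a minimal illustration, not the pipeline used in the talk: the record fields `country` and `sport`, the sample data, and the function name `top_sport_per_country` are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical, already-parsed tweet records. The field names are
# assumptions for this sketch, not the real Twitter payload.
tweets = [
    {"id": 1, "country": "IN", "sport": "badminton"},
    {"id": 2, "country": "IN", "sport": "badminton"},
    {"id": 3, "country": "IN", "sport": "hockey"},
    {"id": 4, "country": "US", "sport": "swimming"},
    {"id": 5, "country": "US", "sport": "swimming"},
    {"id": 6, "country": "US", "sport": "basketball"},
]


def top_sport_per_country(records):
    """Count sport mentions per country and return the most frequent one."""
    counts = defaultdict(Counter)
    for rec in records:
        counts[rec["country"]][rec["sport"]] += 1
    # most_common(1) returns a one-element list of (sport, count) pairs.
    return {country: c.most_common(1)[0][0] for country, c in counts.items()}


print(top_sport_per_country(tweets))
```

A tester can compute the same aggregate independently from the raw input and compare it with what the report shows, which is one simple way to validate the analytics stage.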

As part of this talk, we will follow the tweets through the entire data pipeline and see how each tweet is ultimately utilized and analyzed. We will provide a detailed test strategy covering how a tweet on Twitter is extracted, transformed and loaded into reports. The mindset required for data pipeline testing will be highlighted, and a high-level testing strategy for the Twitter data pipeline will be demonstrated.
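The idea of following a single tweet through extract, transform and load can be illustrated with a small sketch. Everything here is hypothetical: the three stage functions are in-memory stand-ins for real pipeline components, and the record layout (`id`, `country`, `sport`) is an assumption made for the example.

```python
import json


def extract(raw_lines):
    """Parse raw JSON lines into dicts (stand-in for reading from the source)."""
    return [json.loads(line) for line in raw_lines]


def transform(records):
    """Drop records without a sport and normalise casing."""
    return [
        {"id": r["id"], "country": r["country"].upper(), "sport": r["sport"].lower()}
        for r in records
        if r.get("sport")
    ]


def load(records, store):
    """Write records into a dict keyed by tweet id (stand-in for a report table)."""
    for r in records:
        store[r["id"]] = r
    return store


def follow_tweet(tweet_id, raw_lines):
    """Report, stage by stage, whether the given tweet is still present."""
    extracted = extract(raw_lines)
    transformed = transform(extracted)
    loaded = load(transformed, {})
    return {
        "extract": any(r["id"] == tweet_id for r in extracted),
        "transform": any(r["id"] == tweet_id for r in transformed),
        "load": tweet_id in loaded,
    }


# Hypothetical input: tweet 102 has no sport, so the transform should drop it.
raw = [
    '{"id": 101, "country": "in", "sport": "Hockey"}',
    '{"id": 102, "country": "us", "sport": ""}',
]

print(follow_tweet(101, raw))
print(follow_tweet(102, raw))
```

Tracing one known record in this way shows at which stage it is changed or lost, which is often easier to reason about than comparing only aggregate row counts between stages.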

Outline/Structure of the Demonstration

Introduction and Agenda - 5 mins

Big Data Pipeline & Testing Strategy - 10 mins

Demo: Follow the Tweet from Twitter - 10 mins

Code walkthrough - 5 mins

Learning Outcome

● Understanding Big Data pipelines

● Testing Mindset for BigData

Target Audience

Enthusiastic testers who are interested in understanding the testing aspects of BigData

Prerequisites for Attendees

Basic knowledge of data & SQL