Harsha Reddy
Details
Information Technology
University of Central Missouri
2016 – 2017
Fidelity Investments
Sr Cybersecurity Container Engineer/Sr Data Engineer/AWS Cloud Engineer/Python at Fidelity
• Interacted with product owners and end users to understand business requirements and prepare detailed Jira stories that cover all requirements.
• Worked with the container security and vulnerability teams using Qualys, Kibana, DivvyCloud, ECS, EKS, Docker, Python, Oracle, Celery, and containers.
• Implemented requirements against the Qualys API, modifying the existing infrastructure to bring the data into reporting and enforcement views.
• Enhanced existing Python scripts, backed by Oracle, to meet new requirements.
• Developed a Python script to pull data from the Oracle database and performed analysis to report image counts across different time windows.
• Developed a Python script to pull data from the API and insert it into the Oracle database.
• Integrated Qualys logs into Kibana in coordination with the owning team, giving the team access in Kibana to logs previously available only on the server.
• Gathered the requirements for DivvyCloud data with the product owner, team lead, and data architect.
• Coordinated with the DivvyCloud team to request service-account access and other information needed for the DivvyCloud integration.
• Developed a Python script to call the DivvyCloud API and load data for different resource types into Oracle tables; a simplified sketch of this pattern follows this list.
• Integrated the Celery scheduler with the Python code and used it to schedule the DivvyCloud jobs.
• Extracted attributes for the DivvyCloud source tables based on the requirements discussed initially.
• Created integration tables from the source tables to pull container and pod information for the Kubernetes platform and published it to the advisory portal.
• Built analytics to pull the data extracted from Qualys into Tableau dashboards.
• Worked with Redshift, Aurora, Databricks, PySpark, and Spark SQL.
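The DivvyCloud ingestion bullets above describe a common pattern: call a REST API, land the rows in Oracle, and let Celery run the job on a schedule. The sketch below is a minimal illustration of that pattern only; the endpoint, credentials, broker, table, and column names are hypothetical and are not taken from the actual Fidelity code.

```python
# Minimal sketch (hypothetical names throughout): pull one DivvyCloud resource type
# from its REST API and load the rows into an Oracle table, scheduled via Celery beat.
import cx_Oracle          # Oracle driver
import requests
from celery import Celery

# Assumes this module is saved as divvy_jobs.py and a Redis broker is available.
app = Celery("divvy_jobs", broker="redis://localhost:6379/0")
app.conf.beat_schedule = {
    "pull-divvy-instances-daily": {
        "task": "divvy_jobs.pull_resources",
        "schedule": 24 * 60 * 60,          # run once a day (seconds)
        "args": ("instance",),
    }
}

@app.task
def pull_resources(resource_type: str) -> int:
    """Fetch one resource type from the DivvyCloud API and insert it into Oracle."""
    resp = requests.get(
        f"https://divvy.example.com/v2/public/resource/{resource_type}/list",  # hypothetical endpoint
        headers={"Api-Key": "REPLACE_ME"},
        timeout=60,
    )
    resp.raise_for_status()
    # Assumed response shape: {"resources": [{"resource_id": ..., "name": ..., "cloud": ...}, ...]}
    rows = [(r["resource_id"], r["name"], r["cloud"]) for r in resp.json()["resources"]]

    conn = cx_Oracle.connect("user", "password", "db-host/ORCLPDB1")  # hypothetical DSN
    with conn:
        cur = conn.cursor()
        cur.executemany(
            "INSERT INTO divvy_resources (resource_id, name, cloud) VALUES (:1, :2, :3)",
            rows,
        )
        conn.commit()
    return len(rows)
```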
2020 – 2021
Compunnel
Data Engineer/AWS Cloud Engineer/PySpark/Python
• Designed and built generic data pipelines and a distributed data processing framework using AWS services (EC2, EMR, Elasticsearch), Spark, Hadoop, and Python.
• Used Spark to improve the performance and optimization of existing Hadoop algorithms, working with SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
• Performed Spark Streaming processing to bring data in-memory, implement Resilient Distributed Dataset (RDD) transformations, and execute actions.
• Loaded and transformed large sets of structured, semi-structured, and unstructured data and analyzed them by running Hive queries.
• Used Spark SQL to load JSON data, create schema RDDs, load them into Hive tables, and handle structured data.
• Designed highly scalable, fault-tolerant, highly available, and secure distributed computing services in AWS using EC2, EMR, EBS, S3, RDS, Auto Scaling, Lambda, SNS, CloudFormation, Snowflake, and Redshift.
• Designed, developed, and supported various business applications using Java, Scala, UNIX, and big data technologies such as Spark, AWS, and Hadoop.
• Created Python scripts integrated with the boto3 module to work with AWS services such as S3, EC2, Lambda, EMR, and VPC, and automated starting and stopping the cluster's EC2 instances with boto3.
• Developed a Python Lambda function to stop all EC2 instances with a specific tag and scheduled it to run daily through a CloudWatch rule; a simplified sketch follows this list.
• Designed and created complete CloudFormation templates (JSON/YAML) to provision the entire AWS infrastructure through scripting.
• Set up the AWS infrastructure for the backend (Spring Boot) application with Docker, using ALB, ECS, SNS, target groups, Route 53, EC2, security groups, IAM, and RDS.
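The tag-based instance shutdown described above is a standard Lambda-plus-boto3 pattern. The sketch below is a minimal illustration under assumed values: the tag key/value and region are hypothetical, and the function is meant to be invoked once a day by a CloudWatch Events/EventBridge rule.

```python
# Minimal sketch (not the exact project code): a Lambda handler that stops every
# running EC2 instance carrying a given tag.
import boto3

TAG_KEY = "AutoStop"      # hypothetical tag used to mark stoppable instances
TAG_VALUE = "true"

def lambda_handler(event, context):
    ec2 = boto3.client("ec2", region_name="us-east-1")  # assumed region
    # Find running instances that carry the tag.
    reservations = ec2.describe_instances(
        Filters=[
            {"Name": f"tag:{TAG_KEY}", "Values": [TAG_VALUE]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )["Reservations"]
    instance_ids = [
        inst["InstanceId"] for res in reservations for inst in res["Instances"]
    ]
    if instance_ids:
        ec2.stop_instances(InstanceIds=instance_ids)
    return {"stopped": instance_ids}
```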
2019 – 2020
Wipro Limited
Data Engineer/AWS Cloud Engineer/PySpark/Python/Hadoop at Capital One
• Helped develop the Turing SDK API for tokenizing non-public personal information data and publishing it to users, using Java along with SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
• Developed a generic script and solution for different business applications to typecast historical data into new data types using PySpark, SparkContext, and RDDs.
• Developed complex shell scripts that unload data from Teradata, strip junk characters, and place the cleaned data in the Hadoop cluster according to the schema available in the database.
• Worked with major AWS services including EC2, S3, Lambda, SNS, security groups, IAM roles, CloudWatch, CloudFormation, EMR, S3 Glacier, Key Management Service, SQS, and Auto Scaling.
• Created Lambda functions that trigger an EC2 instance, perform the required installation and setup on it, and call the Python script from the Lambda.
• Created an SNS topic and a Python script that publish an email to users when source files are delayed or fail to arrive; a simplified sketch follows this list.
• Rehydrated EC2 and EMR resources using CloudFormation templates once the AMI expires after 45 days.
• Configured a CloudWatch rule with a cron expression as the schedule to trigger the Lambda.
• Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
• Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
• Created Hive tables and worked with them using HiveQL.
• Managed and reviewed Hadoop log files.
• Tested raw data and executed performance scripts.
• Shared responsibility for administration of Hadoop, Hive and Pig.
• Managed and maintained Hadoop clusters to keep jobs running without interruption.
• Good knowledge of Hadoop cluster architecture and cluster monitoring.
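The SNS alerting and CloudWatch cron bullets above describe another common automation: a scheduled Python check that publishes to an SNS topic when an expected source file has not arrived. The sketch below illustrates that idea under assumptions; the bucket, key naming convention, and topic ARN are hypothetical.

```python
# Minimal sketch (hypothetical names): check whether today's source file has landed
# in S3 and, if not, publish an alert to an SNS topic whose email subscribers are notified.
from datetime import date

import boto3

BUCKET = "example-source-bucket"                                       # hypothetical bucket
TOPIC_ARN = "arn:aws:sns:us-east-1:111122223333:source-file-alerts"   # hypothetical topic

def check_source_file(event=None, context=None):
    key = f"incoming/source_{date.today():%Y%m%d}.csv"  # assumed file-naming convention
    s3 = boto3.client("s3")
    sns = boto3.client("sns")
    listing = s3.list_objects_v2(Bucket=BUCKET, Prefix=key)
    if listing.get("KeyCount", 0) == 0:
        # File is late or missing: notify the subscribed users by email via SNS.
        sns.publish(
            TopicArn=TOPIC_ARN,
            Subject="Source file delayed",
            Message=f"Expected file s3://{BUCKET}/{key} has not arrived yet.",
        )
```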
2017 – 2019
Capital One
Data Engineer/AWS Cloud Engineer/PySpark/Python/Hadoop at Capital One
2013 – 2015
HCL Technologies
Associate Software Engineer/Hadoop Engineer/Production Support Engineer at Standard Chartered Bank
Skills
Amazon Web Services (AWS), Apache Spark, AWS Lambda, Big Data, C, Core Java, Data Engineering, Data Migration, Docker, EPM, ERP, Extract, Transform, Load (ETL), Hadoop, Hive, HTML, HTML5, J2EE, Java, Microsoft Excel, Microsoft Office, Microsoft Word, Oracle, Oracle SQL Developer, Performance Tuning, PuTTY, Python (Programming Language), Requirements Analysis, RESTful Web Services, Shell Scripting, SOAP Web Services, Software Development, Software Project Management, Spring Framework, Spring MVC, SQL, Testing, Unix, User Acceptance Testing, Web Services, Pharmaceutical Industry, HPLC, Pharmacovigilance, Pharmacokinetics, Pharmaceutics, Life Sciences, GMP, Clinical Trials, Standard Operating Procedures (SOP), Research, Microsoft PowerPoint, Quality Assurance, Learning Quickly, Committed to Professionalism, Team Leadership, Engaging Speaker, Strategic Thinking, Brand Management, MATLAB, Public Speaking, Engineering, Business Strategy, Data Analysis, Enterprise Resource Planning (ERP), Business Analysis, Business Intelligence, Statistical Modeling, Customer Service, Management, Sponsorship
About
Harsha has an academic background in Computer Science. He is a certified AWS Developer Associate, is certified in Apache Spark with Databricks, and has a strong technical background in Python, Apache Spark, PySpark, AWS, SQL, UNIX, Hadoop, and Java. He has 8+ years of experience as a Data Engineer/AWS Cloud Engineer working with PySpark, Python, and Hadoop.
Currently, he is serving as a Sr Cybersecurity Container Engineer for Fidelity Investments.