Job Location: Bangalore/Bengaluru
- Education: Bachelor's degree
Candidate should be able to:
- Coordinate Development, Integration, and Production deployments.
- Optimize Spark code, Impala queries, and Hive partitioning strategy for better scalability, reliability, and performance.
- Build applications using Maven or SBT and integrate them with continuous integration servers such as Jenkins.
- Run Hadoop ecosystem applications through Apache Hue.
- Build machine learning algorithms using Spark (a minimal MLlib sketch follows this list).
- Migrate data from legacy RDBMS databases to the Hadoop ecosystem (see the JDBC ingest sketch after this list).
- Create mapping documents to outline data flow from source to target.
- Use Cloudera Manager, an end-to-end tool for managing Hadoop operations in a Cloudera cluster.
- Design and deploy enterprise-wide scalable operations
- Work with leading BI technologies such as MicroStrategy (MSTR) and Tableau over the Hadoop ecosystem through ODBC/JDBC connections.
- Performance-tune Impala queries.
- Work on Hive performance optimizations such as using the distributed cache for small datasets, partitioning, bucketing, and map-side joins (see the table-layout sketch after this list).
- Create various database objects like tables, views, functions, and triggers using SQL
- Understand business needs, analyze functional specifications, and map them to the design and development of Apache Spark programs and algorithms.
- Install, configure, and use Hadoop components such as Spark, Spark Job Server, Spark Thrift Server, Phoenix on HBase, Flume, and Sqoop.
- Write Spark jobs to fetch large data volumes from source systems.
- Prepare technical specifications, analyze functional specs, and develop and maintain code.
- Develop end-to-end data pipelines using Spark, Hive, and Impala.
- Document operational problems by following standards and procedures, using the issue-tracking tool JIRA.
- Use REST services to access HBase data and feed it to downstream systems for further processing.
- Wrangle data into workable datasets, handling file formats such as Parquet, ORC, and SequenceFile and serialization formats such as Avro (see the format-conversion sketch after this list).
- Perform feasibility analysis for deliverables, evaluating requirements against complexity and timelines.
- Complete the full software development lifecycle and deliver on time.
- Work with end users to gather requirements and convert them into working documents.
- Interface with various solution/business areas to understand requirements and prepare documentation to support development.
- Work in a fast-paced, team-oriented environment.
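
A minimal sketch of what a Spark MLlib job for the machine-learning bullet above might look like; the table name, feature columns, label column, and model path are all hypothetical:

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.sql.SparkSession

object ChurnModel {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("ChurnModel").enableHiveSupport().getOrCreate()

    // Hypothetical Hive training table with numeric features and a 0/1 label column.
    val df = spark.table("analytics.churn_training")

    // Assemble the raw feature columns into the single vector column MLlib expects.
    val assembler = new VectorAssembler()
      .setInputCols(Array("tenure_months", "monthly_spend"))
      .setOutputCol("features")

    val lr = new LogisticRegression().setLabelCol("churned")

    // Fit the two-stage pipeline and persist the model (path is illustrative).
    val model = new Pipeline().setStages(Array(assembler, lr)).fit(df)
    model.write.overwrite().save("s3a://models/churn")
    spark.stop()
  }
}
```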
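
For the RDBMS-to-Hadoop migration bullet, a sketch of a parallel JDBC read landed into a Hive table; the connection URL, credentials, table names, and partition bounds are assumptions, and the JDBC driver jar must be on the classpath:

```scala
import org.apache.spark.sql.SparkSession

object JdbcIngest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("JdbcIngest").enableHiveSupport().getOrCreate()

    // partitionColumn/lowerBound/upperBound split the read into parallel
    // range queries, so a large table is fetched by many executors at once.
    val df = spark.read.format("jdbc")
      .option("url", "jdbc:oracle:thin:@//db-host:1521/ORCL")
      .option("dbtable", "ERP.ORDERS")
      .option("user", "etl_user")
      .option("password", sys.env("DB_PASSWORD")) // read the secret from the environment
      .option("partitionColumn", "ORDER_ID")
      .option("lowerBound", "1")
      .option("upperBound", "100000000")
      .option("numPartitions", "16")
      .load()

    // Land the extract as a Hive table for downstream processing.
    df.write.mode("overwrite").saveAsTable("staging.orders")
    spark.stop()
  }
}
```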
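
A sketch of the partitioning, bucketing, and map-side-join ideas from the Hive optimization bullet, expressed with Spark's native bucketing API; table names and the bucket count are assumptions:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object TableLayout {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("TableLayout").enableHiveSupport().getOrCreate()

    // Partition by a low-cardinality date column and bucket by the join key:
    // partition pruning cuts scan volume, bucketing reduces join shuffles.
    spark.table("staging.orders")
      .write
      .partitionBy("order_date")
      .bucketBy(32, "customer_id")
      .sortBy("customer_id")
      .format("parquet")
      .saveAsTable("sales.orders_part")

    // Broadcasting the small dimension table yields a map-side join:
    // each executor joins locally, with no reduce-side shuffle.
    val dims = spark.table("sales.customer_dim")
    spark.table("sales.orders_part")
      .join(broadcast(dims), "customer_id")
      .show(10)

    spark.stop()
  }
}
```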
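
A short sketch of the file-format work: reading Avro and rewriting it to the columnar Parquet and ORC formats. It assumes the spark-avro package is on the classpath, and the paths are hypothetical:

```scala
import org.apache.spark.sql.SparkSession

object FormatConvert {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("FormatConvert").getOrCreate()

    // Avro is a row-oriented serialization format, common in landing zones.
    val events = spark.read.format("avro").load("hdfs:///landing/events_avro")

    // Columnar formats like Parquet and ORC compress well and support column
    // pruning, which usually suits downstream analytics better.
    events.write.mode("overwrite").parquet("hdfs:///curated/events_parquet")
    events.write.mode("overwrite").orc("hdfs:///curated/events_orc")

    spark.stop()
  }
}
```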
Candidate should have:
- Experience in Hadoop, HBase, MongoDB, or other NoSQL platforms
- Experience with Spark and Spark SQL
- Excellent communication skills with both technical and business audiences
- Hands-on experience in Java, Spark, Scala, Akka, Hive, Maven/SBT, and Amazon S3 (a minimal build.sbt sketch follows this section)
- Good knowledge of Big Data querying tools, such as Pig, Hive, and Impala
- Good experience debugging issues using Hadoop and Spark log files
- Knowledge of Sqoop and Flume preferred
- Experience with Kafka and REST services is a plus
- Experience in Apache Phoenix and text search (Solr, Elasticsearch, CloudSearch)
- Expertise in shell scripts, cron automation, and regular expressions
- 3+ years of strong native SQL skills
- 1+ years of experience with Hadoop, Hive, Impala, HBase, and related technologies: MapReduce/YARN, Lambda architectures, MPP shared-nothing database systems, and NoSQL systems
- 3+ years of experience with Scala, Spark, and Linux
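
A minimal build.sbt sketch for the kind of Spark/Scala project described above; the project name and versions are illustrative:

```scala
// build.sbt for a Spark ETL project (a sketch; versions are illustrative).
name := "spark-etl"
version := "0.1.0"
scalaVersion := "2.12.18"

libraryDependencies ++= Seq(
  // "provided" keeps Spark out of the assembly jar, since the cluster supplies it.
  "org.apache.spark" %% "spark-sql"  % "3.5.1" % "provided",
  "org.apache.spark" %% "spark-hive" % "3.5.1" % "provided"
)
```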