Job Location: Gurgaon/Gurugram
What you will do:
- Take a platform-first approach to engineering problems.
- Design, build, and maintain robust, automated pipelines that ingest and process structured and unstructured data from source systems into big data platforms, using batch and streaming mechanisms and a cloud-native toolset.
- Design, build, and maintain scalable, platformized infrastructure for machine learning pipelines at scale, making it seamless for both the Data Science and Engineering teams to use machine learning models.
- Bring a strong engineering mindset: build automated monitoring, alerting, and self-healing (restartability/graceful failure) features into the consumption pipelines.
- Create self-serve tools to improve the analytics platform.
- Build data insights and alerts on top of business KPIs using dynamic thresholding.
- Build a rich aggregate/feature store to power the in-house data science platform.
- Tools/Technologies: Kafka, Kafka Connectors, Debezium, Node/TypeScript, React, Scala, Airflow, Snowflake, Redash, Looker, Kafka/Spark Streams, Spark, SageMaker, EMR, Prometheus, Grafana.
What makes you a great fit:
- Experience with data pipeline and workflow management tools such as Luigi and Airflow.
- Familiarity with server-side development: APIs, databases, DevOps, and systems.
- Passionate about building scalable, reliable data products.
- Experience with big data tools such as Hadoop, Kafka/Kinesis, and Flume; knowledge of Scala is an added advantage.
- Experience with relational SQL and NoSQL databases such as MySQL and MongoDB.
- Experience with stream processing engines such as Spark or Storm is an added advantage.