Domain Crunchers | Hiring | Senior Data Engineer – Python/Java | BigDataKB.com | 44831

Job Location: Delhi

Technical Skills:

Languages – Python, SQL, Java, HCL, HTML/CSS/JavaScript, Bash

Database Technology – Spark, Sybase IQ, Db2, Snowflake, Redshift, Hive, Presto, Oracle PL/SQL

Tools – AWS, Terraform, Kubernetes, Docker, Jupyter, IntelliJ, vim, Git, SVN, Apache, nginx, Splunk, SSH

– Should primarily have worked on a Data Lake: a petabyte-scale data warehouse with unique requirements, used across hundreds of teams for many time-sensitive, critical applications.

– Derived a variety of SLOs and health indicators for the lake, and optimized it to bring ingestion time under 15 minutes for more than 90% of users.
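The kind of SLO check described above can be sketched in a few lines of Python. This is a minimal, illustrative example; the data, threshold variable names, and the exact SLO definition are hypothetical, not taken from the posting:

```python
# Hypothetical per-user ingestion times in minutes (illustrative data only).
ingestion_minutes = [3.2, 7.5, 11.0, 14.9, 4.1, 9.8, 12.4, 6.6, 13.1, 16.5]

# Assumed SLO: at least 90% of users see ingestion complete in under 15 minutes.
SLO_TARGET_MIN = 15.0
SLO_FRACTION = 0.90

# Fraction of users whose ingestion finished within the target.
within_slo = sum(1 for t in ingestion_minutes if t < SLO_TARGET_MIN) / len(ingestion_minutes)
print(f"fraction within SLO: {within_slo:.2f}")
print("SLO met" if within_slo >= SLO_FRACTION else "SLO violated")
```

A real monitor would pull these timings from telemetry rather than a literal list, but the comparison against a target fraction is the core of any availability-style SLO.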

– Designed an event-driven near real-time SLO monitor for the lake that processes millions of events a minute.
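An event-driven monitor at that volume typically keeps sliding-window counters rather than storing raw events. A minimal sketch, using only the standard library; the class name, window size, and simulated event stream are all hypothetical:

```python
from collections import deque

class SlidingWindowRate:
    """Track event throughput over a fixed time window (a simplified
    stand-in for the counters an event-driven SLO monitor might keep)."""

    def __init__(self, window_seconds: float = 60.0):
        self.window = window_seconds
        self.events = deque()  # timestamps of events inside the window

    def record(self, ts: float) -> None:
        self.events.append(ts)
        # Evict timestamps that have fallen out of the window.
        while self.events and self.events[0] <= ts - self.window:
            self.events.popleft()

    def rate_per_minute(self) -> float:
        return len(self.events) * 60.0 / self.window

# Simulate a burst of events at one-millisecond spacing.
mon = SlidingWindowRate(window_seconds=60.0)
for i in range(1200):
    mon.record(i * 0.001)
print(mon.rate_per_minute())
```

At "millions of events a minute" a production system would shard these counters across consumers (e.g. Kafka partitions), but the windowed-eviction idea is the same.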

– Crafted Terraform AWS configurations from scratch to deploy key lake components to the cloud.

– Developed and maintained a Jupyter notebook ecosystem on Kubernetes to support the SRE team.

– Wrote Jupyter notebooks to analyze telemetry metrics, develop insights, and establish SLOs. Notebooks typically pulled in data using SQL or PySpark, processed it further in pandas, and visualized results with matplotlib.
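The notebooks described used SQL/PySpark and pandas; the rollup step they perform can be illustrated with the standard library alone. The telemetry rows and grouping key below are hypothetical:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical telemetry rows: (hour_of_day, latency_ms) — illustrative only.
telemetry = [
    (0, 120.0), (0, 80.0), (1, 200.0), (1, 100.0), (1, 150.0), (2, 90.0),
]

# Group latencies by hour and compute the mean: the kind of rollup a
# notebook would do in SQL/PySpark before handing results to pandas.
by_hour = defaultdict(list)
for hour, latency in telemetry:
    by_hour[hour].append(latency)

hourly_mean = {hour: mean(vals) for hour, vals in sorted(by_hour.items())}
print(hourly_mean)  # {0: 100.0, 1: 150.0, 2: 90.0}
```

In a notebook this dictionary would typically become a pandas DataFrame and then a matplotlib line chart of latency over time.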

– Designed an automation framework for Jupyter notebooks to schedule, cache, serve, and email them to clients.

– Implemented and maintained Prometheus metrics for high-level monitoring of the lake. These metrics were surfaced in Grafana for visualization and routed to PagerDuty for alerting.
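Prometheus scrapes samples in a simple text exposition format; in practice one would use the official `prometheus_client` library, but the format itself is easy to show. The metric name and labels below are hypothetical:

```python
def prometheus_line(name: str, labels: dict, value: float) -> str:
    """Render one sample in the Prometheus text exposition format
    (a minimal sketch; the real prometheus_client library handles
    this, plus escaping, HELP/TYPE lines, and the HTTP endpoint)."""
    label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    return f"{name}{{{label_str}}} {value}"

# Hypothetical metric name and labels for a lake ingestion-latency gauge.
line = prometheus_line(
    "lake_ingestion_latency_minutes",
    {"stage": "landing", "region": "us-east-1"},
    12.4,
)
print(line)
```

Grafana then queries these samples from Prometheus with PromQL, and Alertmanager (or a similar router) forwards firing alerts to PagerDuty.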

– Developed on Facebook’s Hadoop system through Hive and Presto, using Facebook’s internal ETL framework.

– Maintained solutions with third parties for ad data ingestion and delivery, including coordination of data definitions and validation checks during the ETL process.
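Validation checks against agreed data definitions usually amount to per-row rule evaluation before load. A minimal sketch; the field names, rules, and sample rows here are hypothetical, not the actual third-party schema:

```python
# Hypothetical data definition for ingested ad-data rows: required fields
# plus a sanity rule, enforced before rows are loaded downstream.
REQUIRED_FIELDS = ("campaign_id", "impressions", "spend")

def validate_row(row: dict) -> list:
    errors = []
    for field in REQUIRED_FIELDS:
        if row.get(field) is None:
            errors.append(f"missing {field}")
    if isinstance(row.get("impressions"), int) and row["impressions"] < 0:
        errors.append("impressions must be non-negative")
    return errors

rows = [
    {"campaign_id": "c1", "impressions": 1000, "spend": 12.5},
    {"campaign_id": "c2", "impressions": -5, "spend": 3.0},
    {"campaign_id": None, "impressions": 10, "spend": 1.0},
]
# Collect failing rows by index with their error lists.
bad = {i: errs for i, row in enumerate(rows) if (errs := validate_row(row))}
print(bad)  # rows 1 and 2 fail
```

In an ETL pipeline, failing rows would typically be quarantined to a side table and reported back to the third party rather than silently dropped.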

– Created APIs using Hack (Facebook's PHP dialect) for upload endpoints.

– Developed dashboards for sales lift data normalized across third parties using Tableau and internal tools.

– Maintained ETL processes: fixed bugs and data-quality issues, optimized CPU and storage usage, and added columns to tables, mainly core ad-metrics data sets with wide impact across the company.

– Developed Facebook status tables, a dataset exceeding 150 TB and 1.2 trillion rows, derived from Facebook's graph structure and curated into an easily digestible Hive table used by research teams for insights, sentiment analysis, and machine learning applications.
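Curating graph-shaped data into a flat table boils down to exploding adjacency structures into one row per edge. A toy sketch of that shape; the tiny graph and column names below are purely illustrative, not Facebook's actual schema:

```python
# Hypothetical graph fragment: author -> list of status IDs (illustrative only).
graph = {
    "user_1": ["status_a", "status_b"],
    "user_2": ["status_c"],
}

# Flatten to one row per (author, status) edge — the general shape of
# curating graph data into a flat, queryable Hive table.
rows = [
    {"author_id": author, "status_id": status}
    for author, statuses in graph.items()
    for status in statuses
]
print(len(rows))  # 3 flat rows from 2 graph nodes
```

At trillion-row scale this explosion runs as a distributed job (Hive or Presto over partitioned storage), but the per-edge flattening logic is the same.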

Apply Here
