Job Location: Indore
Roles and Responsibilities
- Above all else, we are looking for a team player who can work harmoniously within the team.
- Design and implement the data crawling architecture and a large-scale crawling system.
- Develop not just large-scale scraping tools but also data integrity, health, and monitoring systems.
- Collaborate with data and analytics experts to understand / anticipate requirements and strive for greater functionality in our data gathering system.
- Should be able to fetch data from multiple online sources, cleanse it, and build APIs on top of it if required.
- Design, implement, and maintain various components of our data infrastructure (scraping data from multiple sources and storing it in a structured format).
- Build web crawlers/scrapers for a wide variety of alternative datasets.
- Build data extractors for PDFs, images, and various other kinds of raw documents.
- Integrate third-party data sources as required, and incrementally improve the quality of our offerings.
- Author tests to validate data availability and integrity.
- Recommend and sometimes implement ways to improve data reliability, efficiency, and quality of data gathering processes.
- Work well within the team, with little supervision, to research and test innovative solutions.
Desired Candidate Profile
- 2+ years of web crawling/scraping experience is preferred.
- A solid foundation in computer science, with strong proficiency in data structures, algorithms, and software design, and an understanding of how they impact system efficiency in the real world.
- At least 2 years of software experience with Python, shell scripting, and databases.
- Expertise in crawling and scraping using libraries such as Scrapy, Beautiful Soup, or Selenium.
- Ability to inspect and understand the source code of a web page.
- Sound knowledge of XPath and the HTML DOM.
- Expertise with techniques and tools for crawling, extracting, and processing data (e.g., Scrapy, pandas, SQL, Beautiful Soup, Selenium WebDriver, Requests, etc.).
- Expertise in extracting data from multiple disparate sources, including the web, PDFs, and images.
- Experience with browser development tools such as Chrome DevTools is a plus.
- Expertise in bypassing bot detection techniques.
- Expertise in using HTTP proxy techniques to protect web scrapers against site bans, IP leaks, browser crashes, CAPTCHAs, and proxy failures.
- Experience with unit testing.
- Experience with version control systems such as SVN or Git.
- Proficiency with Linux/Unix.
- Experience designing and configuring cloud infrastructure.
- Experience containerizing workloads with Docker.
- Experience developing RESTful web APIs and microservices.
- Good knowledge of distributed technologies and of high-throughput, low-latency, highly scalable real-time systems.
Perks and Benefits
* Employee Discounts.
* Paid sick days.
* Performance bonus.

