Job Description
Note: Please apply only if you have at least 3 years of Scrapy experience.
We are seeking a highly experienced Web Scraping Expert specialising in Scrapy-based web scraping and large-scale data extraction.
This role is focused on building and optimizing web crawlers, handling anti-scraping measures, and ensuring efficient data pipelines for structured data collection.
The ideal candidate will have 5+ years of hands-on experience developing Scrapy-based scraping solutions, implementing advanced evasion techniques, and managing high-volume web data extraction.
You will collaborate with a cross-functional team to design, implement, and optimize scalable scraping systems that deliver high-quality, structured data for critical business needs.
Key Responsibilities
Scrapy-based Web Scraping Development
- Develop and maintain scalable web crawlers using Scrapy to extract structured data from diverse sources.
- Optimize Scrapy spiders for efficiency, reliability, and speed while minimizing detection risks.
- Handle dynamic content using middlewares, browser-based scraping (Playwright/Selenium), and API integrations.
- Implement proxy rotation, user-agent switching, and CAPTCHA solving techniques to bypass anti-bot measures.
Advanced Anti-Scraping Evasion Techniques
- Utilize AI-driven approaches to adapt to bot detection and prevent blocks.
- Implement headless browser automation and request patterns that mimic human behavior.
Data Processing & Pipeline Management
- Extract, clean, and convert large-scale web data into structured formats such as JSON and CSV, or load it into databases.
- Optimize Scrapy pipelines for high-speed data processing and storage in MongoDB, PostgreSQL, or cloud storage (AWS S3).
Code Quality & Performance Optimization
- Write clean, well-structured, and maintainable Python code for scraping solutions.
- Implement automated testing for data accuracy and scraper reliability.
- Continuously improve crawler efficiency by minimizing IP bans, request delays, and resource consumption.
Required Skills and Experience
Technical Expertise
- 5+ years of professional experience in Python development with a focus on web scraping.
- Proficiency in Scrapy-based scraping.
- Strong understanding of HTML, CSS, JavaScript, and browser behavior.
- Experience with Docker is a plus.
- Expertise in handling APIs (RESTful and GraphQL) for data extraction.
- Proficiency in database systems such as MongoDB and PostgreSQL.
- Strong knowledge of version control systems like Git and collaboration platforms like GitHub.
Key Attributes
- Strong problem-solving and analytical skills, with a focus on efficient solutions for complex scraping challenges.
- Excellent communication skills, both written and verbal.
- A passion for data and a keen eye for detail.
Why Join Us?
- Work on cutting-edge scraping technologies and AI-driven solutions.
- Collaborate with a team of talented professionals in a growth-driven environment.
- Opportunity to influence the development of data-driven business strategies through advanced scraping techniques.
- Competitive compensation and benefits.
Job Classification
Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: Software Development
Role: Back End Developer
Employment Type: Full time
Contact Details:
Company: Leapstrat Interactive
Location(s): Kolkata
Keyskills:
Scrapy
Git
CI/CD Pipeline
Web Scraping
Python
Web Crawling
Data Extraction
Data Scraping
Regular Expressions
Docker Container