For over 10 years, this company has pioneered a remote-working strategy; while the head office is based in Ireland, it has a globally distributed R&D team working from more than 30 countries.
The team is passionate about web crawling, scraping, and data science, and the company is now a leader in turning web content into useful data.
You'll join a team that is constantly improving its cloud-based web crawling platform, off-the-shelf datasets, and turnkey web scraping services.
As a Senior Engineer, your primary goal will be to develop and grow a new web crawling and extraction SaaS.
The new SaaS provides an API for automated e-commerce and article extraction from web pages using Machine Learning.
This is a distributed application written in Java, Scala, and Python where components communicate via Apache Kafka and HTTP. It is orchestrated using Kubernetes.
You will design and implement distributed systems: building a large-scale web crawling platform, integrating deep-learning-based web data extraction components, working on queueing algorithms and large datasets, and creating a development platform for other company departments.
As this SaaS is still in the early stages of development, you will have a large impact on the system design!
DAY TO DAY:
Work on the core platform: develop and troubleshoot a Kafka-based distributed application, and write and modify components implemented in Java, Scala, and Python.
Work on new features, including design and implementation. You will be responsible for the complete lifecycle of your features and code.
Solve distributed-systems problems, such as scalability, transparency, failure handling, security, and multi-tenancy.
WHAT WE LOVE TO SEE:
3+ years of experience building large-scale data processing systems or high-load services.
Strong background in algorithms and data structures.
Strong track record in at least two of these technologies: Java, Scala, Python.
3+ years of experience with at least one of them.
Experience working with Linux and Docker.
BONUS POINTS FOR (or your opportunity to learn!):
Apache Kafka experience.
Experience building event-driven architectures.
Understanding of web browser internals.
Good knowledge of at least one RDBMS.
Knowledge of current cloud provider offerings: Google Cloud Platform (GCP), Amazon Web Services (AWS), etc.
Web data extraction experience: web crawling, web scraping.
Experience with web data processing tasks: finding similar items, mining data streams, link analysis, etc.
Job Types: Full-time, Permanent
Salary: €70,000–€85,000 per year