Sr. Machine Learning Engineer, Data Platform
Redwood City, CA, United States
Company Description
PubMatic is a publisher-focused sell-side platform for an open digital media future. Featuring leading omni-channel revenue automation technology for publishers and enterprise-grade programmatic tools for media buyers, PubMatic's publisher-first approach enables advertisers to access premium inventory at scale. Processing over 2 trillion ad impressions per month, PubMatic has created a global infrastructure to drive publisher monetization and control over their ad inventory.
Since 2006, PubMatic's focus on data and technology innovation has fueled the rise of the programmatic industry as a whole. Headquartered in Redwood City, California, PubMatic operates 13 offices and six data centers worldwide.
Job Description
PubMatic is seeking a Sr. Machine Learning Engineer with big data experience to help build the next-generation ML platform. The ideal candidate is a self-motivated problem solver with a strong background in the big data tech stack, software design, and development.
If you get excited about building a highly impactful machine learning platform that processes large datasets, in a creative, fast-paced, open-culture environment, then you should consider applying for this position.
Responsibilities
Design, build, and implement our highly scalable, fault-tolerant, highly available big data platform to process terabytes of data and provide customers with in-depth analytics.
Develop big data pipelines using a modern technology stack such as Spark, Hadoop, Kafka, HBase, and Hive.
Develop analytics applications from the ground up using a modern technology stack such as Java, Spring, Tomcat, Jenkins, REST APIs, JDBC, Amazon Web Services, and Hibernate.
Build data pipelines that automate high-volume data collection and processing to provide real-time analytics.
Work collaboratively with the Machine Learning and Monetization teams to democratize data for analysis and impact.
Build solutions that help the Monetization team run experiments quickly and analyze data accurately to calculate impact.
Develop a good understanding of the engineering tech stack and ML algorithms in order to make the data processing jobs that power these algorithms more efficient and scalable.
Develop systems to objectively monitor the impact of various experimental changes on machine learning algorithms, clearly highlighting both positive and negative outcomes.
Manage Hadoop MapReduce and Spark jobs, and resolve any ongoing issues with operating the cluster.
Apply professional software engineering best practices across the full software development life cycle, including coding standards, code reviews, committing to GitHub, preparing documents in Confluence, continuous delivery using Jenkins, automated testing, and operations.
Participate in Agile/Scrum processes such as Sprint Planning, Sprint Retrospective, Backlog grooming, User story management, work item prioritization, etc.
Work closely with the Quality Engineering team, which ensures the quality of the platforms/products and the performance SLAs of Java-based microservices and Spark-based data pipelines.
Support customer issues over email or JIRA (bug tracking system), providing updates and patches to customers to fix the issues.
Collaborate with the Technical Writing team on the technical documents published on the documentation portal.
Perform code and design reviews of peers' code as part of the code review process.
Qualifications
3-5 years of coding experience in Java.
Solid computer science fundamentals including data structure and algorithm design, and creation of architectural specifications.
Expertise in applying professional software engineering best practices across the full software development life cycle, including coding standards, code reviews, source control management, documentation, build processes, automated testing, and operations.
A passion for developing and maintaining a high-quality code and test base, and enabling contributions from engineers across the team.
Expertise in big data technologies such as Hadoop, Spark, Kafka, and HBase would be an added advantage.
Experience developing and delivering large-scale big data pipelines, real-time systems, and data warehouses is preferred.
Demonstrated ability to achieve stretch goals in a highly innovative, fast-paced environment.
Demonstrated ability to learn new technologies quickly and independently.
Excellent verbal and written communication skills, especially in technical communications.
Strong interpersonal skills and a desire to work collaboratively.
Compensation And Benefits
Base Salary Range: $160,000 - $180,000
In accordance with applicable law, the above salary range provided is PubMatic’s reasonable estimate of the base salary for this role. The actual amount may vary, based on non-discriminatory factors such as location, experience, knowledge, skills and abilities. In addition to salary PubMatic also offers a bonus and a competitive benefits package.
Return to Office: PubMatic employees around the world have returned to our offices via a hybrid work schedule (3 days “in office” and 2 days “working remotely”) that is intended to maximize collaboration, innovation, and productivity among teams and across functions.
Benefits: Our benefits package includes the best of what leading organizations provide such as, paid leave programs, paid holidays, healthcare, dental and vision insurance, disability and life insurance, commuter benefits, physical and financial wellness programs, unlimited DTO in the US (that we actually require you to use!), reimbursement for mobile expenses, fully stocked pantries, and fresh catered lunches 5 days a week.
Diversity and Inclusion: PubMatic is proud to be an equal opportunity employer; we don’t just value diversity, we promote and celebrate it. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.