Senior Machine Learning Engineer
Little Ferry, NJ, United States
Role Details:
Senior MLOps Engineer - Python
Intellectual Point is currently seeking a qualified MLOps Engineer to join our dynamic team. This role focuses on enhancing and maintaining state-of-the-art machine learning operations that support large language models (LLMs) across various platforms.
$100K-180K + Quarterly Bonuses (Performance Based)
Flexible PTO & 50% Coverage: Health, Vision, and Dental
Remote Only OR Hybrid OR Sterling, VA OR Reston, VA
Fully Funded IT Certifications & Internal Education
Accelerated Career Growth w/ Smaller Engineering Team
Work-Issued Laptop - Latest M1 MacBook Pro
You will report to our SVP, Software Engineering.
Key Responsibilities:
On-Premise Architecture Development: Design and implement robust architecture using Nvidia GPUs for self-hosted LLMs, ensuring system resilience and efficiency.
Pipeline Optimization: Enhance ML pipelines with a focus on reducing latency and increasing concurrency, especially for Retrieval-Augmented Generation (RAG) systems.
Private Cloud Deployment: Execute on-premise deployments of LLMs on company-owned hardware (DGX/HGX/XE9840), with an emphasis on scalability and hardware efficiency.
MLOps Orchestration: Merge MLOps and DevOps principles to refine workflows, enhance monitoring, and improve system visibility.
Operational Efficiency: Apply MLOps and DevOps strategies to boost team productivity and streamline operations.
Deployment Automation: Seamlessly transition research models into deployment pipelines, ensuring smooth integrations.
Scalable Development: Work collaboratively to scale research code from the training phase to inference, optimizing performance.
Monitoring, Logging, and Governance: Maintain rigorous monitoring and logging practices to track system performance and adhere to strict governance standards.
Qualifications:
Programming Experience: At least 2 to 3 years of experience with Python and Node.js.
Educational Background: Bachelor’s degree in Computer Science or a related field, or equivalent professional experience.
MLOps/DevOps Experience: 3-5 years of experience in MLOps or DevOps roles, preferably within a production environment.
Infrastructure: Strong skills in Infrastructure as Code (IaC), Docker, Kubernetes, and continuous integration/continuous deployment (CI/CD) pipelines.
Security and Compliance: Comprehensive understanding of security practices related to MLOps, including proactive data protection measures.
Preferred Skills:
Framework Proficiency: Familiarity with TensorFlow, PyTorch, and RAG systems.
Monitoring Tools: Experience with Prometheus, Grafana, or ELK Stack.
Agile Methodologies: Proficiency in Agile development practices.
Database Knowledge: Experience with various data stores, including SQL and NoSQL databases or data lakes.
Who you are:
You enjoy learning new things.
You love to ship code fast.
You strive to make things simple.
You have an eye for modern full-stack design.
You have incredibly high urgency & attention to detail.
You are a clear communicator who can describe bottlenecks precisely.
You are comfortable dealing with ambiguity and can make sure everyone is on the same page.