Scientist, Application Engineering
Cambridge, MA, United States
Scientist, Data and Application Engineering
Arena BioWorks, located in Cambridge, MA, is a new biomedical research institute focused on uncovering the mechanisms of human disease to identify opportunities for therapeutic intervention. The institute's private funding model allows Arena to advance novel therapeutics by forming and supporting biotech companies up to early clinical or Series B stage.
We are seeking a talented and motivated data scientist to join our growing AI/ML, Data Science, and Data Engineering team. In this role, you will implement the database engineering needed to integrate proprietary Arena data (e.g., cheminformatics, proteomics, functional genomics) with select public data resources. You will work collaboratively on back-end infrastructure, data integration (including API use and development), and data visualization to support scientific projects. You will also support integration between our chemistry and molecular biology electronic laboratory notebooks (ELNs), which requires mastering their APIs to support rapid prototyping of global and business analytic workflows. The successful candidate is expected to thrive in a multidisciplinary environment, coordinate with other research teams, and support internal projects.
Responsibilities:
Design, create, and manage database systems that integrate internal and external data sources and allow scientists to investigate data independently.
Automate workflows for data retrieval or storage in electronic lab notebooks (e.g., Benchling, Signals).
Collaborate with teams of scientists to help implement and automate data-analysis pipelines and build data infrastructure.
Contribute to and proactively communicate data-engineering standards, establish templates and frameworks, and ensure efficient use of cloud services and tools.
Communicate insights and progress to the broader organization.
Qualifications:
B.S. or M.S. in computer science (or equivalent training) and 8+ years of industry experience.
Extensive experience with database technologies, architecture, and management.
Proficiency in Python and SQL is essential, and scientific computing experience (R, MATLAB/Octave) is strongly encouraged; familiarity with additional languages such as Java, C++, or Julia is a plus.
Experience with data storage and management in the cloud: data lakes, warehouses, databases.
Ability to design and implement back-end models for complex scientific workflows and entities.
Adaptable, creative, and quick to learn; efficient at multitasking and troubleshooting.
Demonstrated ability to successfully work in cross-functional teams with an emphasis on teamwork, collaboration, and communication.
(preferred) NGS pipeline development experience with workflow frameworks such as Nextflow, Snakemake, Airflow, or AWS Step Functions.
(preferred) Experience with data-visualization tools, such as Tableau or Spotfire.
(preferred) Knowledge of various machine-learning algorithms and their applications in chemistry, drug discovery, and other areas.
Benefits:
Competitive salary and benefits package.
Opportunity to work on innovative scientific projects.
Collaborative and supportive work environment.
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.