


Data Engineer



Software Engineering, Data Science
Canada · United States · Remote
Posted on Friday, June 7, 2024

About us:

Nomic was founded with the purpose of making biology easier to measure. To do this, we have untangled some of the most difficult problems in protein profiling. Our team is combining DNA nanotechnology, high-dimensional flow cytometry, laboratory automation, and machine learning to develop the world’s highest throughput proteomic platform: the nELISA.

Since spinning out of McGill University, we have partnered with and provided platform access to dozens of drug discovery groups including GSK, 4 of the top 10 pharmas, and leading biotechs. We have recently launched a state-of-the-art manufacturing and protein profiling facility that enables multiplexed measurement of >2.5M samples a year, generating an effective 500M protein assays in the process.

We are building a diverse team of engineers, scientists, and world-changers. We like to break down difficult problems using a first-principles approach, often leveraging the latest breakthroughs from across the scientific and technological spectrum to drive our mission forward.

Nomic is headquartered in Montreal, Canada, with a satellite research lab in Boston, Massachusetts. The majority of our team is based in Montreal and works in a new, shared office/laboratory space under an in-person policy that allows flexible work from home.

About the role:

Our Data team’s mission is to build, operate, and maintain the data infrastructure and data pipelines needed for analyzing nELISA data at scale. Strong fundamentals in data science and data engineering are key to our vision of the future, and every aspect of our company today is geared towards generating more useful proteomic data. In the lab we are scaling the nELISA to generate higher-plex and lower-cost proteomic data points, and outside the lab we help our users make the best use of nELISA data for their biological research.

Our data roadmap includes building robust pipelines for decoding nELISA datasets, generating advanced and application-specific bioinformatic pipelines to help customers understand their unique datasets, and developing improved internal-facing tools that will let us execute faster in the lab by extracting insights from our nELISA profiling and manufacturing QC data on-demand.

As a Data Engineer, you will play a critical, first-hand role in developing core improvements to the data pipelines and data infrastructure that handle all nELISA data. In particular:

  • You will primarily be responsible for designing, building, iteratively improving, automating, deploying and scaling our data pipelines for processing flow cytometry data into quantitative protein measurements. This will be done in close collaboration with your Data Engineering and Software Engineering teammates.

  • You will support or lead the design and implementation of our data platform architecture, including data lakes and related infrastructure, and build and maintain data pipelines to extract, transform, and load (ETL) data from various sources into the data storage systems, ensuring data quality, reliability, and scalability.

  • You will also support the R&D and Lab Operations teams by developing additional data support features and applications, i.e. the internal tooling needed to support Nomic’s growth going forward. This will include any new pipelines for analyzing nELISA data, including QC data from our daily manufacturing and profiling operations.

  • This role will involve substantial communication, teamwork, and attention to detail, especially when identifying and troubleshooting issues related to nELISA data and ensuring we build the right tools, and the right abstractions, for our teammates and for customers.

  • When tooling does not yet exist, you will be responsible for analyzing nELISA data using our suite of decoding and analysis tools, as well as leveraging your technical and bioscience domain expertise to develop new data analysis pipelines when needed.

  • You will be relied on to support the R&D and Lab Operations teams with guidance on experimental design and analysis when needed.

What we’re looking for:

  • 4+ years of industry experience building data pipelines or machine learning pipelines from the ground up for bioscience data, or equivalent technical proficiency developed in an academic setting.

  • 2+ years of software engineering/development experience.

  • Strong Engineering or Applied Sciences background (or a related technical field).

  • Statistical skills including Bayesian statistics, sampling methods, mixed models, and other statistical concepts.

  • Proficient in the fundamentals of modern biotechnology tools and their associated methods, in particular in at least one of: sequencing, immunoassays, nucleic acid amplification, DNA nanoarchitecture and design, separation-based techniques for biological samples and compounds, biophysics / fluorescence / FRET, and signal processing.

  • Working knowledge of surface chemistry, sandwich immunoassays, and related lab techniques such as ELISA, protein expression/purification, SDS-PAGE/western blot, cell culture, cytometry, bead-based assays, and microscopy, as well as proteomic tools and methods, e.g. mass spectrometry, immunoassays, and/or separation-based techniques.

  • Experience working collaboratively on data science problems with wet lab scientists, ideally in a scaling startup.

  • Excellent communication skills (written, verbal, and in a codebase).

  • Experience interfacing with a Laboratory Information Management System (LIMS), quality control (QC) processes, and/or a Quality Management System (QMS).

Join us if you:

  • Connect deeply with our mission, ambition and sense of duty. Our mission isn’t marketing flash: we developed our technology to better measure biology and discover biomarkers for early disease detection. We firmly believe we will be successful in literally eradicating certain diseases by enabling them to be diagnosed earlier. We also believe that our hard work to bring this technology to its full potential is our duty.

  • Are up for a challenge and want to grow. We are a team of problem-solvers, and we continually put ourselves to the test and go into the unknown. We have a growth mindset on both hard and soft skills, and we rely on each other to give critical and candid feedback to ensure that we can all reach our full potential.

  • Want to be at the cutting-edge of biotechnology. The nELISA is a new tool that leverages DNA nanotechnology to generate proteomic data more efficiently than ever before. You get to design and build the data pipelines that will support the scaling of this technology going forward.

  • Love writing code and analyzing biological data, and want to be responsible for driving improvements to data pipelines from a full-stack perspective.

  • Prefer working and communicating within a diverse cross-functional team. You would get to interface with your teammates from the broader Engineering, Operations, and Commercial teams on a daily basis, joining a collaborative, diverse, and inclusive team where your ideas will be valued.

  • Want the responsibility of addressing some of our hardest problems. Data is one of our core competencies, and researching and developing improvements to the way we analyze data has a compounding benefit on all other aspects of our company and our customers, most notably the scientists using the nELISA and the patients who will ultimately benefit from nELISA data.

If you are passionate about building data pipelines for biology, want to drive innovation in proteomics, and are eager to make a meaningful impact in the world, we invite you to apply and join us on our journey to redefine proteomics and the understanding of biology.