Apache Spark and Hadoop are emerging as standard platforms for data analysis in genomics. Atgenomix harnesses these advanced computing technologies in one complete enterprise-grade system to accelerate the best-practice NGS workflow by more than 10x. Our data parallelization seamlessly transforms conventional disk-file workloads into efficient, scalable, and fault-tolerant workloads while letting researchers interact with the same data in the same ways. Atgenomix is spurring the future of parallel computing in genomic informatics.
The candidate will work with a team of natural-born data scientists and programmers and will be responsible for developing the workflow engine and data mining algorithms in the Atgenomix SEQSLAB platform, which automates and accelerates NGS analysis workflows using cutting-edge big-data technologies, namely Apache Spark and Hadoop.
3+ years of experience in large-scale software system development. Experience developing applications in Spark and Hadoop. Proficiency in Scala, Java, and Python programming. NGS bioinformatics experience is a plus.