Senior Software Engineer for the Advanced BigData Analytics Service
Oracle Cloud Infrastructure (OCI) is leading the transformation to cloud-native Big Data technologies in our hyperscale, multi-tenant cloud deployed in more than 20 regions worldwide. OCI is committed to providing the best in cloud services to meet the needs of our customers, who are tackling some of the world's biggest challenges. We offer unique opportunities for smart, hands-on engineers with the expertise and passion to solve difficult problems in distributed, highly available services and virtualized infrastructure.
We're looking for an experienced engineer with expertise and passion in solving difficult problems in distributed systems and highly available services, who can help build out and then further develop an advanced Data Analytics Service based on Apache Spark and cubing technologies. At OCI, you can help shape, design, and build innovative new systems from the ground up. These are exciting times in our space: we are growing fast, still at an early stage, and working on ambitious new initiatives. Senior engineers at any level can have significant technical and business impact.
From a technology perspective, this is a greenfield development environment with a huge amount of autonomy, leaving us free to build and innovate without being encumbered by legacy products and services. The charter of our team, the Advanced BigData Analytics Service team, is to build a fully managed, cloud native service focused on large scale analytics on (mostly) unstructured data stored in data lakes, and on management of the data in those data lakes. The service is based on Apache Spark and Kubernetes, with a number of OLAP extensions on top. The scope of the service work encompasses not just good integration with OCI's native infrastructure (security, cloud storage, etc.), but also deep integration with other relevant cloud native services in OCI (like Data Catalog). It includes cloud native approaches to service level patching and upgrades, and maintaining high availability of the service in the face of random failures and planned downtimes in the underlying infrastructure (e.g., patching the Linux kernels to address a security vulnerability). Developing systems that monitor the service, collect telemetry on its runtime characteristics, and take action on that telemetry data is part of the charter. The platform work involves technical deep dives into areas such as query planning and optimization and query execution, and making the SQL engine work really well in the cloud, so that it can use the cloud's effectively unlimited resources and yet be cost effective.
Work with members of the team and participate in the design and development of key features (e.g., cloud native SQL query optimizations) needed to make the BigData Analytics Service successful
Support a highly available and resilient cloud service, and build the supporting systems inside and outside Apache Spark needed to deliver on this
Contribute to the Apache Spark community as necessary
Experiment with the compute, storage, and networking infrastructure as necessary, and recommend improvements to how the service runs along dimensions such as performance, reliability, and cost.
Stay on top of what's happening in the Apache Spark and other related communities, and participate in community discussions, events, conferences, etc.
4+ years of experience in software development.
Strong knowledge of data structures, algorithms, and distributed systems.
Proficiency in Spark development.
Experience building services using Spark.
Strong programming skills in Java
Development experience using cloud native infrastructure such as Kubernetes
Experience with database internals and data management
Take initiative and be responsible for delivering complex software
Excellent problem solver, analytical thinker, and quick learner
BS or MS in Computer Science or a related technical field
Design, develop, troubleshoot, and debug software programs for databases, applications, tools, networks, etc.