Big Data Engineer

The Goal, Inc. is actively seeking a Big Data Engineer to join our team in Washington, DC.

RESPONSIBILITIES:

  • Architect, develop, implement, and test data processing pipelines and data mining/data science algorithms across a variety of hosted environments
  • Assist customers with translating complex business analytics requirements into technical solutions and recommendations across diverse environments
  • Define and implement data ingestion and transformation methodologies, including between classified and unclassified sources
  • Communicate results and educate others through design and development of insightful visualizations, reports, and presentations
  • Participate in the design, implementation, and support of Big Data, Analytics, and Cloud solutions across all stages of the development lifecycle
  • Conduct regular peer code reviews to ensure code quality and compliance with industry best practices
  • Design, implement, and optimize leading Big Data frameworks (Hadoop, Spark, SAP HANA) across hybrid hosting platforms (AWS, Azure, on-prem)
  • Review security requirements, analyze processes, and define security strategy to implement compliance and controls in line with organizational standards and industry best practices
  • Develop accreditation and security documentation, including System Security Plans (SSPs) and Authorization to Operate (ATO) packages
  • Provide thought leadership and innovation, recommending emerging technologies and optimizations/efficiencies across architecture, implementation, hosting, and more
  • Lead the planning, development, and execution of data onboarding and processing capabilities and services for diverse customers and data sets

REQUIREMENTS:

  • 10+ years of progressive experience in architecting, developing, and operating modular, efficient, and scalable big data and analytics solutions
  • Fluency and demonstrated expertise across multiple programming languages, such as Python, Java, and C++, and the ability to pick up new languages and technologies quickly
  • At least 5 years of experience with distributed computing frameworks, specifically Hadoop 2.0+ (YARN) and associated tools such as Avro, Flume, Oozie, Sqoop, and ZooKeeper
  • Hands-on experience with Apache Hive and Apache Spark, including Spark components (Streaming, SQL, MLlib)
  • Hands-on experience with data warehousing and business intelligence software including Cloudera and Pentaho
  • Experience developing data visualizations leveraging Tableau