Data Engineer
As the world’s leading sports data and technology company, STATS powers sports. We are trusted by more than 800 clients around the globe to enhance fan engagement and maximize team performance by analyzing sports data from more than 100,000 games a year with unrivaled speed and accuracy. We power sports on and off the field through data feeds, video analysis, sports content and research, player tracking through STATS SportVU®, and customizable digital solutions.
For more information, go to www.stats.com and follow STATS on Twitter @STATS_Insights.
What You’ll Do:
• Contribute to all aspects of the data pipelines under the team’s ownership, including design, implementation, refactoring, automated testing, deployment, and uptime
• Select and integrate any Big Data tools and frameworks required to provide requested capabilities
• Implement, monitor, and optimize ETL processes
• Define and implement data retention and governance policies
• Improve internal team processes by keeping what’s working and discarding what’s not
• Collaborate on the vision of the workflows under the team’s ownership
What You’ll Bring:
• More than 4 years of relevant back-end software development experience
• Good diagnostic, analytical, design, and communication skills
• Ability to craft simple and elegant solutions to complex problems
• Experience building and supporting real-time and batch data pipelines via scripts, tools, and services
• Experience working with Apache Spark, Hadoop v2, MapReduce or other open-source big data technologies
• Experience working with stream-processing systems, using solutions such as Apache Storm or Spark Streaming
• Experience building event-driven or message-driven architecture using tools like RabbitMQ and Kafka
• Experience managing cloud platforms (AWS, Azure, Google Cloud) via CLI and web tooling
• Knowledge of various ETL techniques and frameworks, such as Flume
• Strong understanding of Lambda Architecture, along with its advantages and drawbacks
• Familiarity with Oracle or other relational databases
• Familiarity with MongoDB, Redis, or other NoSQL databases
• Experience with Python scripting
• Experience with service-oriented architecture (SOA) and distributed systems
• Experience with Agile development processes
• BS or MS degree in Computer Science, or equivalent related experience
Bonus Skills:
• Experience with Big Data ML toolkits, such as Mahout, Spark MLlib, or H2O
• Experience with the Atlassian suite of tools (JIRA, Confluence, Bitbucket)
• Experience with CI/CD workflows and tools like Jenkins, CircleCI, and/or AppVeyor
• Experience with Node.js microservice development