Principal Data Engineer
Job Description
At Discover, be part of a culture where diversity, teamwork, and collaboration reign. Join a company that is just as focused on its employees as it is on its customers, and that is consistently recognized for both. We’re all about people, and our employees are why Discover is a great place to work. Be the reason we help millions of consumers build a brighter financial future, and achieve yours along the way with a rewarding career.
In this role, you will take a hands-on approach with next-generation technologies, contributing to the team that delivers our latest data-driven platforms and analytic capabilities.
As a Principal Data Engineer on the Data Governance Technology team, you will provide engineering leadership in building reusable frameworks and products that enable data governance across our company’s data assets, both on premises and in the cloud. These products and frameworks will be used by many data engineering teams to support our organization’s data needs. You will play a critical role in creating a single pane of glass that supports a comprehensive, authoritative collection of enterprise metadata, and in building enterprise-level solutions that help Discover unlock the value of its data.
You will be on the cutting edge of finding and integrating new technologies and tools for data-centric projects. Some of the technologies you will use include Ab Initio DQE, Alation (Data Catalog), Spark (Python/Scala), Kinesis, Kafka, EMR, MPP databases (Snowflake and Redshift), NoSQL databases, and various other AWS services (Lambda, Glue, Step Functions, etc.). Additionally, you will work on real-time governance solutions using tools such as Kafka, Kinesis, and Spark Streaming, and will deploy application code using CI/CD tools and techniques.
Responsibilities
- Provides senior-level technical consulting to peer data engineers during design and development of highly complex and critical data projects.
- Provides engineering leadership to create and enhance data solutions that enable seamless integration and flow of data across the data ecosystem.
- Designs and develops data ingestion frameworks leveraging open-source tools such as NiFi, Sqoop, Hive, Pig, Java, and Python, as well as open-source data processing and transformation frameworks.
- Designs and develops real-time processing solutions using open-source tools.
- Develops, implements, and supports application code, deployed data applications, and analytical models.
- Drives product design and acts as a technical liaison to support future capability development and integration efforts.
- Partners with business and technology teams to complete business initiatives in the data governance space; develops data-driven solutions utilizing current and next-generation technologies to meet evolving business needs.
- Quickly identifies opportunities and recommends possible technical solutions in metadata management and data cataloging.
- Follows an Agile development approach for custom data pipeline development (cloud and on premises).
- Utilizes multiple development languages and tools, such as Python, Spark, Java, CI/CD pipelines, and customized API calls, to build prototypes and evaluate the results for effectiveness and feasibility.
- Works heavily within the AWS ecosystem, using AWS services.
- Operationalizes open-source data analytics tools for enterprise use.
- Ensures proper data governance policies are followed by implementing or validating capabilities such as data registry and metadata management.
- Helps coach and manage junior team members.
Minimum Qualifications
At a minimum, here’s what we need from you:
- Bachelor’s Degree in Computer Science or related field
- 6+ years of experience in Data Platform Administration/Engineering
Preferred Qualifications
If we had our say, we’d also look for:
- 8+ years of experience in Data Platform Administration/Engineering
- Excellent written and verbal communication, presentation and professional speaking skills
- Master’s degree in Computer Science, Information Technology, Operations Research, or related engineering field
- 5+ years of experience as a lead engineer on a team of peer- or junior-level developers
- Development experience with programming languages and frameworks such as Java, Python, and Spark
- Experience developing REST APIs
- Strong Unix/shell scripting skills
- Deep level understanding and implementation experience across AWS Data Services such as Glue, Kinesis, SQS, Redshift, VPC, IAM, EC2, RDS, SNS, CloudWatch, CloudTrail, Step Functions, Lambda, EMR
- Experience in ETL / Data application development (preferably Ab Initio)
- A deep understanding of enterprise metadata management and the data catalog function, and of how cataloged data fuels safe and sound operations and innovation
- Experience in data management and building data pipelines for cloud data assets
- Strong ability to build and leverage external relationships
- Strong decision-making abilities, including gathering information, deciding, and putting those decisions into action
Discover Financial Services is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, protected veteran status, among other things, or as a qualified individual with a disability.