PySpark Data Engineer, Analytics
Job Description Summary
The Enterprise Analytics team at CCC has an open position for a Big Data Engineer. The team builds platforms to provide insights to internal and external clients of CCC businesses in auto property damage and repair, medical claims and telematics data. Our solutions include analytical applications against claim processing, workflow productivity, financial performance, client and consumer satisfaction, and industry benchmarks.
Our data engineers use big data technology to create best-in-industry analytics capability. This position is an opportunity to use Hadoop and Spark ecosystem tools and technology for micro-batch and streaming analytics. Data behaviors include ingestion, standardization, metadata management, business rule curation, data enhancement, and statistical computation against data sources that include relational, XML, JSON, streaming, REST API, and unstructured data. The role is responsible for understanding, preparing, processing and analyzing data to drive operational, analytical and strategic business decisions.
The Big Data Engineer will work closely with product owners, information engineers, data scientists, data modelers, infrastructure support and data governance positions. We look for engineers who start with a base of big data skills but who also love to learn new tools and techniques in a big data landscape that is endlessly changing.
* Build end-to-end data flows from sources to fully curated and enhanced data sets. This can include the effort to locate and analyze source data, create data flows to extract, profile, and store ingested data, define and build data cleansing and imputation, map to a common data model, transform to satisfy business rules and statistical computations, and validate data content.
* Produce data building blocks, data models, and data flows for varying client demands such as dimensional data, data feeds, dashboard reporting, and data science research & exploration
* Modify and maintain complex SQL and PL/SQL for Oracle ETL and BI/DW data flows
* Produce automated tests of data flow components
* Use knowledge of the business to automate business-specific tests for data content quality
* Automate code deployment and promotion
* Build automated orchestration and error handling for use by production operation teams
* Provide technical expertise to diagnose errors from production support teams
* Collaborate with team members in an Agile team (e.g., Scrum)
* Participate as both leader and learner in team tasks for architecture, design and analysis
* Coordinate within collocated on-site teams as well as with work plans for off-shore resources
* Bachelor’s Degree or Two Year Technical Program with a Programming Specialization
* 3+ years’ experience with complex data flows
* Unix commands and scripting
* Hadoop fundamentals and architecture: HDFS, MapReduce, job performance
* Open source big data tools such as Hive, HBase, Parquet, Spark SQL
* Advanced SQL for data profiling and data validation
* Programming in a language such as Python (preferred), Scala, etc.
* Familiar with open source monitoring and orchestration tools such as Ambari, Oozie
* Advanced transformations and statistical computations using Spark SQL and Hive programming
* Experience with development of metadata-driven and fully parameterized data processing tools
* Programming in streaming-data tools such as Spark-Scala, NiFi and Stream Analytics Manager
Why Choose CCC:
We promote a healthy work-life balance and offer generous benefit plans and resources designed with employee satisfaction in mind.
What we value is simple - customers, employee commitment, collaboration and clear communication.
We hire people who will embrace the company’s goals and productively contribute in ways that help us serve the customer, innovate, and stay strong.
We make it a priority to keep employees healthy, happy and enriched.
- Healthy – Wellness programs, competitive medical benefit offerings
- Happy – Recognition programs, a confidential employee assistance program, Perkspot/employee discount program and potentially flexible work arrangements such as staggered start times
- Enriched – Tuition reimbursement, training and learning programs, and leadership development opportunities
Please Note: Contingent Workers, Field Inventory Representatives and Interns are not eligible for the benefits above.
Our corporate headquarters is located in downtown Chicago within the historic Merchandise Mart, a certified LEED (Leadership in Energy and Environmental Design) building.
CCC Information Services was recognized by Forbes as one of America’s Best Mid-Sized Employers in 2018 and ranked #17 in the Top 100 Digital Companies in Chicago in 2017 by Built In Chicago.
CCC is ready to help you shift your career into high gear. Let's get started!