Big data as a general concept is incredibly varied in its use cases and the conclusions it leads to. While it can help build out a single product, it can also be the basis of an entire business model. And regardless of the industry or purpose the data serves, the sheer quantity of a big data set is often rivaled only by the qualitative minutiae of the information it contains.
We spoke to data experts at seven Chicago tech companies working with big data to find out just what they’re doing with the enormous sets of information they collect and how those efforts translate to tangible things in real life.
Relativity simplifies the discovery process during litigations, internal investigations and compliance projects with its cloud-based e-discovery software. Lead System Engineer Corey Wagehoft said the cloud is a key part of how his team leverages big data and keeps up with client demand.
How is your company leveraging big data as part of your product?
My team is primarily responsible for building a common shared compute platform for our development teams. This allows our SaaS product, RelativityOne, to scale automatically to meet increased customer demand. RelativityOne can process massive amounts of data, and we built this platform using widely adopted technologies that have been proven to handle the demand we require. We are also working with bleeding-edge technology to open new opportunities for the developers building RelativityOne.
What is an example of a real-world impact you’ve produced using big data?
We can meet large-scale data demands on a much larger scale in the cloud than we could by running our product in a traditional data center. No matter what size of data set needs processing, we can meet the demand with no interaction necessary from our customers or operations.
Discover Financial Services is an international provider of banking, payment solutions and other financial offerings designed to benefit consumers. Senior Vice President and Chief Data Officer Akshay Kumar said digital tools are key, but adept tech professionals are the most vital parts of working with big data successfully.
How is your company leveraging big data as part of your product?
Building an infrastructure to leverage big data is only half the battle. It’s not just about creating a data lake and moving data out to the cloud. We are building an analytics platform ecosystem that makes the experience of doing analytics much easier. We accomplish that by continuing to hire the best data scientists, data engineers and technologists, who have the freedom to explore their own ideas to help the end business users.
What is an example of a real-world impact you’ve produced using big data?
We are applying big data and AI to identify when a customer might be getting into financial trouble and addressing that before the account goes into delinquency and collections. Detecting patterns up front lets us be more proactive and engage customers earlier to steer them into a financial program that better meets their needs, lifestyle and spending behaviors.
Through detecting these patterns, on-track payments increased by more than 20 percent, which enabled customers to improve their credit standing. We also saw a 40-percent decrease in outbound calls to customers for late payments and, overall, improved the customer experience through better communication and customized programs.
Applying predictive analytics to the big data it collects, Arity provides insight to companies in the transportation industry in efforts to improve road safety. Senior Manager of Platform Engineering Kevin Glickley said arming the company’s data experts with the tools they need to extract insight is a tall order, but one that’s incredibly valuable to fulfill.
How is your company leveraging big data?
Our solutions leverage driving data and predictive analytics to achieve a single goal: making transportation smarter, safer and more useful for everyone. Our platform powers Arity by building and managing cloud data centers that will securely enable us to ingest, enrich, persist and process data at scale.
Our big data and analytics engineering team partners with our data science and analytic engineering communities to integrate state-of-the-art cloud tools into our ecosystem to process billions of miles of mobile trip data in a secure and efficient manner. We combine Amazon Web Services hardware with open source and proprietary software to yield models that give our customers the insights they need.
What is an example of a real-world impact you’ve produced using big data?
Optimizing the persistence of sensor data into our data lake by leveraging different geospatial partitioning methods and file format types helps our users process petabytes of data at scale. To remain profitable, we must manage costs through Ansible, Terraform and CloudFormation scripts so we may dynamically spin up and down compute servers as system demands require.
Enabling our data scientists and engineers with the tools necessary to create insights is no small feat. We accomplish this by ensuring we tune our massively parallel processing architecture to match the workload demands. We utilize our critical thinking skills to manage cloud infrastructure and enhance our analytic ecosystem with additional capabilities like GPU servers. By leveraging dynamic compute servers, we have helped expedite our analytic models — turning our data into insights.
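The geospatial partitioning mentioned in this answer can be pictured with a small sketch. It is not Arity's implementation: the fixed-size grid, the `grid_cell` and `partition_points` helpers, the `cell_deg` parameter and the sample points are all assumptions made for illustration. The idea is only that records are grouped under a location-derived key, so a query about one area reads a few partitions instead of the whole data lake.

```python
import math
from collections import defaultdict


def grid_cell(lat: float, lon: float, cell_deg: float = 1.0) -> str:
    """Map a coordinate to a fixed-size grid cell used as a partition key.

    Hypothetical scheme: production systems often use geohash- or
    hexagon-style indexes instead of a plain latitude/longitude grid.
    """
    row = math.floor((lat + 90.0) / cell_deg)
    col = math.floor((lon + 180.0) / cell_deg)
    return f"cell_r{row}_c{col}"


def partition_points(points, cell_deg: float = 1.0) -> dict:
    """Group sensor points by the grid cell they fall into."""
    partitions = defaultdict(list)
    for point in points:
        key = grid_cell(point["lat"], point["lon"], cell_deg)
        partitions[key].append(point)
    return dict(partitions)


# Made-up sample data: two points near Chicago, one near New York.
sample_points = [
    {"trip_id": "a", "lat": 41.88, "lon": -87.63},
    {"trip_id": "b", "lat": 41.90, "lon": -87.65},
    {"trip_id": "c", "lat": 40.71, "lon": -74.01},
]
partitions = partition_points(sample_points)
```

With one-degree cells, the two Chicago points share a partition and the New York point lands in its own. A smaller `cell_deg` would trade fewer records per partition for more partitions to manage.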
Numerator helps clients, which include Nike, PepsiCo and Samsung, formulate an actionable picture of their customer purchasing processes. Engineering Lead Scott Ferguson said big data actually acts as the backbone of everything the company does.
How is your company leveraging big data as part of your product?
Numerator's product is big data. We collect paper and electronic receipts, online product price data and advertisements from a variety of mediums. Our product is collecting that disparate data and providing our clients with a means of consuming and understanding it.
What is an example of a real-world impact you’ve produced using big data?
Because we collect information that describes the path to a purchase as well as the purchase itself, we are able to paint a clear picture for our clients that describes what they are looking at or where they should be looking. For example, trends in our purchase data can be explained by observed events in our promotional data, or survey data can help to validate advertising strategy. Collecting data at scale is the foundation for being able to provide that level of insight to our users.
IMC Trading is a technology-driven trading firm that operates on over 100 international, regulated trading platforms. Senior Data Engineers Dave Evans and Zack Kobza provided some statistical and operational insight into how their team is optimizing massive amounts of data.
How is IMC leveraging big data?
Evans: Data’s importance here led to the creation of a data engineering team back in 2015. The data landscape has evolved from batch and T+1 processing using Hadoop to near real-time using Kafka and Cassandra to respond to rapidly changing market conditions. Analysis tools and reporting built on these platforms accelerate the discovery of new opportunities to keep us competitive in the markets.
We generate upwards of 15 terabytes of data per day including books and records, packet captures, analytics and metadata. We capture it for long-term storage in Hadoop clusters and process it with tools like Apache Spark, Hive and many other means of analysis. Big data, and the platforms associated with it, form a fundamental part of IMC’s daily operations.
What is an example of a real-world impact you’ve produced using big data?
Kobza: We recently began leveraging Kafka to perform near real-time streaming data analysis to compute live updates to our trading parameters, which take better advantage of changing market conditions. This enables our quants to build streaming, data-driven visualizations and tools, and to respond much faster with trading adjustments that, in the past, would have been made slowly and applied retroactively.
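To make the contrast between streaming and retroactive batch updates concrete, here is a minimal sketch. It is not IMC's system: a plain Python list stands in for a Kafka topic, and the exponentially weighted moving average, the `stream_ewma` function, the `alpha` parameter and the tick data are all assumptions chosen for illustration. The point is that the parameter is recomputed as each event arrives rather than after the day's data has been collected.

```python
from typing import Dict, Iterable, Iterator, Tuple


def stream_ewma(
    events: Iterable[Tuple[str, float]], alpha: float = 0.5
) -> Iterator[Tuple[str, float]]:
    """Yield an updated moving average per symbol as each event arrives.

    `events` is any iterable of (symbol, price) pairs. In a real
    deployment it would be fed by a message consumer; here it is a
    stand-in so the sketch runs on its own.
    """
    state: Dict[str, float] = {}
    for symbol, price in events:
        previous = state.get(symbol)
        if previous is None:
            # First observation for this symbol seeds the average.
            state[symbol] = price
        else:
            state[symbol] = alpha * price + (1 - alpha) * previous
        yield symbol, state[symbol]


# Made-up ticks for two hypothetical symbols.
ticks = [("ABC", 100.0), ("ABC", 102.0), ("XYZ", 50.0), ("ABC", 104.0)]
updates = list(stream_ewma(ticks, alpha=0.5))
```

Because the function is a generator, each update is available as soon as its event is consumed, which is the property a batch or T+1 job lacks.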
Pareto Intelligence helps healthcare providers improve their financial performance through data-based solutions that include revenue management, risk adjustment and more. Director of Solutions and Analytics Zain Jafri said big data has been a key part of a new product that predictively assesses a patient’s risk for needing eventual treatment.
How is your company leveraging big data as part of your product?
Given our focus on the healthcare industry, big data continues to be a critical part of what we offer our customers. The healthcare industry generates massive volumes of data daily. However, its full promise is yet to be realized due to messy data, incompatible systems and other common data issues. We help healthcare companies organize and make sense of their data and provide advanced analytics to generate business insights and intelligence. Our goal is to help the healthcare industry utilize big data to help improve outcomes and overall experience for insurers, providers and consumers.
What is an example of a real-world impact you’ve produced using big data?
One of the most exciting capabilities we’ve developed is our Clone Analysis. Its goal is to identify new members likely to have a high health risk so our clients can encourage preventative measures. The analysis is similar to a recommender algorithm where new members are compared to historical members to determine cost and health risk. This machine learning model is iteratively updated every quarter to track the progress of the new members. Without big data, we would not have the predictive power to confidently provide these Clone Analysis outcomes for our clients.
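The comparison of new members to historical members can be illustrated with a nearest-neighbor sketch. This is not Pareto's Clone Analysis: the two features (age and a count of chronic conditions), the `estimate_cost` helper, the value of `k` and every record below are hypothetical, and a real model would use many more features, scaled appropriately. The sketch only shows how "members like this one" can drive an estimate.

```python
import math


def euclidean(a, b) -> float:
    """Straight-line distance between two equal-length feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def estimate_cost(new_member, historical, k: int = 2) -> float:
    """Average the annual cost of the k most similar historical members.

    Features are used unscaled here for brevity; in practice they would
    be normalized so that no single feature dominates the distance.
    """
    ranked = sorted(
        historical, key=lambda rec: euclidean(new_member, rec["features"])
    )
    nearest = ranked[:k]
    return sum(rec["annual_cost"] for rec in nearest) / len(nearest)


# Made-up records: features are (age, number of chronic conditions).
historical_members = [
    {"features": (30, 0), "annual_cost": 1000.0},
    {"features": (32, 0), "annual_cost": 1200.0},
    {"features": (65, 3), "annual_cost": 9000.0},
    {"features": (70, 4), "annual_cost": 11000.0},
]
estimate = estimate_cost((67, 3), historical_members, k=2)
```

Here the new member's two closest matches are the older, higher-cost records, so the estimate is their average. Rerunning the comparison as new claims arrive is one simple way to picture the quarterly updates described above.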
Through its communications and education tools, Edovo works with jails and prisons to help them better serve their incarcerated populations and operate more efficiently. Service Design Lead John Timpone said big data helps, in a number of ways, to reduce inmate sentences and ease the difficulties of readjusting to civilian life.
How is your company leveraging big data as part of your product?
Gathering and analyzing data at scale is critical to helping us understand how to optimize an individual’s time during their incarceration, as well as understanding what efforts are most effective when trying to prepare individuals for re-entry into their communities. We're currently operating in over 40 states across the country, and aggregating data on a national level can allow us to analyze how regional factors affect program success and engagement. Understanding user behavior helps us determine which content is most effective, and interaction patterns provide valuable insight into how to best support learners within jails and prisons.
What is an example of a real-world impact you’ve produced using big data?
Using course completion metrics, transcripts and user responses, we have been able to help incarcerated individuals successfully display rehabilitative progress to judges and parole officers — ultimately leading to sentence reduction and an increase in good behavior credits. Our tablets also serve as a tool to help gather data and support research initiatives like mental health research, health literacy programs and university studies.
As we continue to grow and gather more data, sentiment analysis across communications and learner engagement with courses will allow us to determine key risk factors and identify when someone may be at risk for violence or suicide within a facility. It can also help us identify individuals who are strongly motivated to rehabilitate so that we can accelerate their progress and path to re-entry.