Over 5,000.
That’s how many data scientists make up Illinois’ workforce. Chicago employs 3,600 of them, and that number is growing.
With that growing number comes an equal number of unique daily routines.
While each day varies, every professional, including Caterpillar's Lead Data Scientist David Villero, begins their day with a universal ritual to prepare for their work ahead.
Built In Chicago sat down with Villero and IMC Trading Data Engineer Bo He to get a glimpse into their workday — beyond their morning coffee.
Cat Digital is the digital arm of Caterpillar Inc., responsible for bringing digital capabilities to its yellow iron.
Describe a typical day for you. What work do you tackle, who do you collaborate with and what tech do you use?
In my current role, I have the flexibility to work both from home and the office, though I am typically at home. I begin my mornings with a cup of coffee and a walk with my wife and dog, during which I listen to financial podcasts or audio articles on computer science and physics. I settle into work around 8:30 a.m. and try to wrap up by 5 p.m.
I prioritize and tackle my tasks based on the day’s requirements, the type of work and the effort needed to complete specific tasks or tickets. Each day is a little different.
Currently, my work focuses on two main areas: training reinforcement learning models for logistical decision-making in simulated environments, and exploring the applications of GenAI models, investigating new models and engaging in prompt engineering.
I frequently use Python and leverage AWS services like SageMaker, ECS, S3 and Lambda to scale up training and prototype solutions. I also use Snowflake SQL for various tasks. For deployment, reproducibility purposes and running things on AWS, I try to develop everything in Docker containers.
Describe a project you’re working on right now. What’s the impact of this project, and what do you find rewarding or challenging about it?
One of my projects involves training reinforcement learning models for logistical decision-making in simulated environments of mine sites and, eventually, construction sites.
Caterpillar places significant emphasis on virtual product development, a critical phase in the product development process where the behavior of Cat machines is tested in a virtual simulation environment. This stage is essential for validating the designs and performance of new machines, particularly those for which no field data exists.
A dedicated team at Caterpillar focuses on creating physics-based models of these machines, working closely with engineering teams responsible for various machine components.
Electrification, the focus of one of my current projects, introduces a unique set of challenges and constraints, including charging and battery management, which must be considered in the development process.
Creating simulated environments can help identify these challenges and opportunities that may arise in the real world. Through digital solutions, we can test and understand real-life operations using AI technologies, connecting the physical and the digital world.
What’s the culture like on your team? Are there any rituals or practices that enable team members to grow their knowledge and connect with each other?
Within Caterpillar, we utilize multiple communication channels. I collaborate with two teams, both of which operate similarly and keep in touch with one another despite being located around the globe.
We hold virtual meetings a couple of times a week to discuss project progress, address any blockers and seek assistance as needed. These meetings are concise, with additional breakout sessions for more in-depth collaboration or preparation. We also have weekly virtual meetings dedicated to showcasing new technologies, repositories and libraries that we find interesting and useful.
We also connect in person. A few times a year, we have team bonding activities that can take place at one of our locations: Colorado; Peoria, Illinois; and Chicago, of course.
Aside from our meetings, our culture drives me to learn more. My manager encourages me to engage with academic research groups, attend conferences and experiment with new technologies. Additionally, Caterpillar offers training programs focused on our products’ design and creation, helping us better understand them.
This supportive environment makes it straightforward to build a stable and fulfilling career.
IMC is a leading market maker, trading on more than 100 exchanges around the world.
Describe a typical day for you. What work do you tackle, who do you collaborate with and what tech do you use?
As an engineer on the data analytics team, I work mostly with our trading teams, which consist of traders, quantitative researchers and fellow engineers.
My team works on data-driven projects and builds software to enable scalable analysis. We use a combination of proprietary solutions, written in house, and open-source technologies like Python, Spark and Dask. We write software and features and are responsible for our own operations.
My day usually starts with beginning-of-day ops, like checking production systems. If there is any scheduled maintenance, we make sure things run OK across systems. We tackle this as a team. I then pick up my sprint work, which includes writing code, reviewing my colleagues’ code, discussing with stakeholders, making sure my team has what is needed to make forward progress and more. Sometimes, I will have a brainstorming session with teammates over coffee.
IMC teams fill our calendars with fun events, like the recent month’s push-up challenge; I take part when I can. Depending on our sprint cadence, we will have team huddles, as well as sprint reviews, or sprint ceremony-style deep dives.
Describe a project you’re working on right now. What’s the impact of this project, and what do you find rewarding or challenging about it?
One theme of my team's involvement is the streaming data analytics product area. Streaming means live. This is an important frontier for our data organization. We want our users to push the boundaries of data analysis, and we want to help them effortlessly build analytics to address problems at scale and use them to guide real-time decision making.
Every project is unique. Some of our applications hinge on latency, others focus on scale; most of our users want both.
What I find most rewarding is that we get to collaborate with various trading teams across the firm in order to understand the business motivation. We get to work with uncertainty and build out iterative solutions. Perhaps just as importantly, it is incumbent on us to build out software abstractions and present the solution in a way that is elegant and easy to reason about. We believe that when things are simple to use, they are easier to get right.
What’s the culture like on your team? Are there any rituals or practices that enable team members to grow their knowledge and connect with each other?
Highly collaborative and committed. The team is sizable enough that we can see how our work becomes impactful to the firm, yet agile enough that it often feels like I am working with a few friends “hacking” on fun projects.
Our data analytics team operates in two-week sprints that conclude with a review session. During the sprint review, everyone is welcome to present any show-and-tell topics: code we wrote, bugs we fixed, math problems that motivated the code and design choices that did not work. The format can be as formal as a rehearsed presentation or as spontaneous as drawing system diagrams on the whiteboard. This works quite well and makes learning fun and engaging.