How 8 Engineering Leaders Build With Scalability in Mind

Creating a product that can grow whenever it needs to requires creative thinking, innovative leadership and the right tech stack to do it.

Written by Taylor Rose
Published on Oct. 27, 2023

So much of urban planning seemed like a good idea at the time.

It’s why highways are often raised above the rest of the city instead of nestled below it, as Boston did with the Big Dig, which opened up more than 300 acres of land. It’s why many water pipes are not holding up against the test of time and the growing populations of many U.S. cities. It’s why airports are often too small. It’s why traffic in Atlanta is so bad. It’s why many neighborhoods are built for cars instead of walking.

At the time, all of these choices seemed like the right approach. What none of these initial designs had in mind was growth and a changing environment. 

Scalability is about anticipating growth and change. And scalability applies to many things beyond urban planning — including businesses and technology. 

Built In sat down with eight engineering leaders who have led their teams through projects that had to be flexible and ready for growth at a moment’s notice. 
 

 

Sumeet Lakhanpal
Director, myQ App Development • Chamberlain Group

Chamberlain Group provides technology and products that give customers seamless and secure access to their homes and businesses.   

 

Describe what scalability means on your team.

Scalability, to me, is twofold — scaling product and scaling production. In reference to the product, it means the ability of the application to support more users without any performance impact. It also means the ability of the team to add new features to the application quickly and without major re-architecture or a full-scale development effort.

Scalability isn’t just about the application; the processes we use to build it are equally important. Chamberlain Group is transforming rapidly as an organization and our processes must transform and scale so that we can achieve our ambitious growth objectives. Scalability is increasingly important for us as our myQ app user base grows to tens of millions, as we continue to launch new connected hardware and as we support more and more innovative use cases to “make access simple” for our customers.

“Chamberlain Group is transforming rapidly as an organization and our processes must transform and scale so that we can achieve our ambitious growth objectives.”

 

How does the need for scalability impact your architecture?

The importance of scalability is embedded in our culture. Scalability of the application, product or feature is always top of mind in the development process because it’s essential to a successful, long-term outcome. 

At Chamberlain Group, we work hand in hand with the business from the start to understand and define requirements. Having a deep understanding of current and future plans helps streamline current development efforts and prevent reengineering or re-architecture work down the road.

We also have a well-defined architecture and design review process, which ensures we think about reusability and impact on infrastructure well before a scalability challenge presents itself. Before launch, we run load tests to ensure that the application will withstand the expected scale. Once we launch, we monitor, evaluate and assess continuously to make sure we’re learning and improving to make future launches even more successful.

 

What tool does your team use to support scalability, and why?

The very best tool we have at our disposal is our amazing and talented team. Chamberlain Group hires the best tech talent with experience building scalable solutions and we place a huge emphasis on continued education. Staying up on trends and industry best practices is how the team grows as tech professionals and ensures we’re creating best-in-market, disruptive products and solutions.

In terms of technology, it’s all about the architecture we’ve built and continue to improve iteratively. We use microservices that can be easily scaled, supported by component-based architecture in the front end. We’ve built reusable libraries for building products and features quickly, and we’ve integrated tools into our CI/CD to ensure code quality. We’ve automated load and performance tests, which run before every new feature launches, and we use multiple best-in-class tools to continuously monitor our applications.

 

 

Maria Tzeka
Data Engineer Team Lead • Adyen

Adyen is a financial technology company that provides end-to-end payment options and data insights. 

 

How do you preemptively prepare for unexpected growth? 

Scalability is about doing more with less and being prepared. The first thing that comes to mind with scalability is the technical aspect: can we handle a large number of requests or a large amount of data? You can’t wait for the requests to show up, and adding people to the team isn’t always the answer either. To think at scale, you have to be intentional with processes: see if steps can be eliminated and areas streamlined to simplify the development process.

 

How do you build tech with scalability in mind?

I’m a big advocate of separating concerns and breaking down the code into modular components. This makes it easier to plug and play in the development process. 

There are always challenges, whether it’s inheriting old code that may not meet our standards or understanding how to generalize common functionality for constant evolution — all of these require adaptability. I recommend starting with something basic and building upon it with more use cases. As a result, the code becomes much easier to scale, and you’re able to finish projects faster by reusing code.

“I’m a big advocate of separating concerns and breaking down the code into modular components. … As a result, the code becomes easier to scale, and you’re able to finish projects faster by reusing code.”

 

What tools or technologies does your team use to support scalability, and why?

As our data scales, it’s important to use big data technologies such as Spark and HDFS with Hive to allow us to process and store data in a distributed fashion. Spark offers an insightful UI that provides a lot of detail on the workload and processing of each of our jobs. We rely on it heavily to ensure our queries are efficient and we won’t run into memory issues.

 

 

Matt Nassr
Global Data Engineering Lead • Optiver

Optiver is a financial tech company that focuses on pricing, execution and risk management. 

 

Is scalability built into your processes?

Scalability is at the core of everything we do in global data engineering at Optiver. Scalability ensures our data pipelines remain rock solid, processing torrents of information from around the world, day in and day out. We collect and process data from hundreds of global sources, accumulating petabytes (and years) of historical data. Our researchers, traders, developers and analysts rely on pristine datasets that are readily available to make timely and informed business decisions. It’s not just about making things work, it’s about everything working seamlessly, all the time. This level of scalability isn’t just a ‘nice-to-have’; it’s critical to Optiver’s core market-making mission.

 

How do you work on scalability early on?

It all starts with the design. There is no amount of hard work that can compensate for a lack of scalability in the initial blueprint. This means that even in the early stages of development, we are thinking about what we can parallelize, what we can break up into multiple stages, and what robust quality checks we can introduce. While each dataset may have its unique quirks, the mindset remains the same: everything we build needs to run like clockwork at a global scale and without any hiccups. At the scale we operate, even extremely rare edge cases need to be completely and proactively solved.

“Scalability starts with design. There is no amount of hard work that can compensate for a lack of scalability in the initial blueprint.”

 

What tools or technologies does your team use to support scalability, and why?

We use a combination of technologies, including object storage, NoSQL and versatile languages like C, C++ and Python, to build scalable solutions. Again, I believe that achieving scalability ultimately comes back to the design. Scalable tools will only take you so far; the system design itself must be scalable.

 

 

Celeen Rusk
Senior Software Engineer • Chime

Chime is a financial technology company that partners with regional banks to design member-first products. 

 

How does your team encounter challenges with scalability? 

We often think of scalability in terms of strategy, as in horizontal and vertical scaling: either you add more power to the machines you’re running or you add more machines — and often it’s both. I tend to think of both horizontal and vertical scaling in terms of software systems, not just servers. Horizontal means spreading out the system or its responsibilities, like sharding your database or splitting that monolith into microservices. Vertical means increasing the power in the system. 
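As a rough illustration of the horizontal approach mentioned above, sharding a database, here is a minimal sketch of hash-based shard routing. It is not a description of Chime's systems: the shard names and the `shard_for` and `shard_name` helpers are hypothetical, and real deployments often use consistent hashing or a lookup service instead, so that shards can be added without remapping most keys.

```python
import hashlib

# Hypothetical shard names, for illustration only.
SHARDS = ["db-0", "db-1", "db-2", "db-3"]


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a hash that is stable across processes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % shard_count


def shard_name(member_id: str) -> str:
    """Return the name of the shard that would hold this member's rows."""
    return SHARDS[shard_for(member_id, len(SHARDS))]
```

The same key always lands on the same shard, which is what lets each database hold only a slice of the data.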

I work in Chime’s risk engineering organization. Our software sits between our members and the rest of Chime’s product, providing assessment and assurance in response to risk. An example of this is Authentication, the area I work most in. Our job is to ensure that each account can only be accessed by the member who owns it. To us, scalability means managing a ton of incoming requests and vetting them with other parts of risk engineering. We do this mostly with a few different approaches to horizontal scaling and we have some examples of vertical scaling thrown into our software system.

 

How do you guide your team through this process? 

One way my team approaches scaling is through software and application layers — scaling the software design and architecture appropriately as the domain grows. We try to follow the common wisdom and design our systems so that dependencies are loosely coupled. Best design intentions are only part of the process; software systems are always changing with new requirements, goals, products and features. Some part of scaling means keeping up with that changing system in a way that maintains flexibility as the system’s responsibilities expand.  

“Design is only part of the process; software systems are always changing, and some part of scaling means keeping up with that changing system in a way that maintains flexibility as the system’s responsibilities expand.”

 

Let’s use a classic database scaling example to illustrate: running out of integers. It’s happened to most companies above a certain size, but we only hear about it when it causes an incident. You monitor your SQL database so that you are prepared with a solution before you run out of integers. Enabling appropriate scaling at the software level works the same way: it requires awareness of the state of your domain and the ability to prioritize opportunities to expand before it’s too late and your previously loose dependencies end up backed into a corner between all the cool stuff you’ve been iterating on so quickly.
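The integer example can be made concrete with a small headroom check. This is a sketch under stated assumptions: it assumes a signed 32-bit auto-increment key, and the function names and the 180-day threshold are hypothetical choices for illustration, not something described in the interview.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer column can hold


def id_headroom(current_max_id: int, ids_per_day: float, limit: int = INT32_MAX):
    """Return (fraction_used, days_remaining) for an auto-increment key."""
    if ids_per_day <= 0:
        raise ValueError("ids_per_day must be positive")
    fraction_used = current_max_id / limit
    days_remaining = (limit - current_max_id) / ids_per_day
    return fraction_used, days_remaining


def needs_attention(current_max_id: int, ids_per_day: float, min_days: float = 180) -> bool:
    """True when the key would run out sooner than the chosen lead time."""
    _, days_remaining = id_headroom(current_max_id, ids_per_day)
    return days_remaining < min_days
```

Feeding a check like this from a scheduled query against the largest tables turns the problem into a planned migration rather than an incident.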

 

What tools are the go-tos? 

We rely heavily on monitoring tools like Datadog to keep eyes on our systems.  We have a very slick CI/CD system that rolls back automatically when canaries fail. Chime started as a Ruby on Rails shop. Some parts of Chime have shifted to using other languages like Go to facilitate vertical scaling in the software layer and add speed to the system. 

In the authentication domain, one approach we take to scaling is to qualify the requests we receive and effectively limit the number of requests we ultimately have to deal with at deeper layers of the stack. Specifically, there are rate-limiting strategies and step-up strategies we can set at the front of the request cycle to try to filter out bot traffic, inauthentic login attempts or even just login attempts that are more likely to result in fraud.
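The interview does not say which rate-limiting algorithm is used, so the following is only one common option, a token bucket, sketched with hypothetical names. Each caller gets a bucket that refills at a steady rate; a request is allowed only if a token is available, which caps bursts before they reach deeper layers.

```python
import time


class TokenBucket:
    """Allow up to `capacity` requests in a burst, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.tokens = capacity
        self.updated = clock()

    def allow(self) -> bool:
        """Consume one token if available and report whether the request may proceed."""
        now = self.clock()
        elapsed = now - self.updated
        # Refill for the time that has passed, never exceeding the bucket size.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In practice a limiter like this would be keyed per account, device or IP address and backed by shared storage so that every server sees the same counts.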

Beyond that, we love to lean on NoSQL databases like DynamoDB, which has scaled well in the risk domain broadly. I find that DynamoDB’s modeling demands are an appropriate fit for our domain and a useful tool in guiding us towards designing appropriate space between services.

 

 

Drew Carlson
Leader, Software Engineering • Cisco Meraki

Cisco Meraki is an information technology company that brings together IT, IoT and physical environments. 

 

How do you think about scalability in your role? 

Scalability can come in all shapes and sizes, and it’s a factor we consider in multiple aspects of our work. It’s often associated with how an application handles a high volume of customers or events simultaneously, but it extends beyond that to maintain Cisco Meraki as an industry leader in cloud-managed networking solutions. 

One way I think about scalability is how it allows us to efficiently manage the critical path of ongoing projects. Each process, whether in a recurring cadence or a one-time project, has its own path. Throughout the day, we encounter multiple paths that interconnect and compound their effects. If your CI/CD pipeline has a large queue preventing developers from verifying their code, this impacts their velocity and leads to bottlenecks in the development lifecycle. 

Scalability applies across various roles, spanning management and individual contributor positions. As an engineering manager at Meraki, I help my team by overseeing team deliverables, enhancing development velocity, facilitating project growth and optimizing infrastructure management. All these elements converge when it comes to scaling and enhancing my team’s performance.

 

How do you tackle the build of a scalable product? 

When working on a project in any field, there are multiple factors to consider to ensure project scalability. First, you must have an architecture and design plan in place. Your feature may be extremely capable, but if it’s too complex then it’ll be challenging to scale. Another factor that will impact scalability is the business logic of an application. It’s imperative to work with stakeholders to outline requirements about the business logic that align with the design plan. Having both plans in place before development will provide a strong foundation from which you can build. 

Once you’ve built your project, it’s critical to sufficiently monitor your application and track any issues before attempting to scale. Expect regular refactoring to be part of the cost of any project. Performance benchmarks across your codebase and feature set will make it easier to determine where to allocate resources and make the biggest impact. In order to scale software, you must plan before you build and test before you scale. 

“In order to scale software, you must plan before you build and test before you scale.”

 

What tools or skill sets do you look for when you start a new project? 

There isn’t a single tool or mindset that can solve all scalability issues, and it’s not something a single person can tackle. Depending on the type of problem, you must approach it from a multitude of angles. One of the best things about Meraki is our large pool of incredible engineers specializing in skills that span across the software development spectrum. If a feature calls for a new way of storing petabytes of data, we’ll work with our data infrastructure team to design and architect a solution. If we need to onboard a hundred new developers, we’ll work with our DevOps pipeline team to scale up our CI/CD infrastructure to handle the large influx of requests to run the automated test suite. By working across departments toward a common goal, we can utilize our specialties to work through any problem.  

I consider myself very fortunate to work for a company where we all treat each other as the same team. One of the key principles of Meraki is “everybody in,” and I see that every day with my coworkers. Meraki would not be able to tackle complex problems without the collaboration and comradery across the company.

 

 

David Fruin
VP of Engineering • Vail Systems, Inc.

Vail Systems is a software development company that works to make communication easier for clients. 

 

Describe what scalability means to you.

In processing telephone calls, scalability and high availability are both important and very much related to each other in our business. High availability requirements dictate that our scaled platform must be fully redundant. We run things at N + N across data centers: one of the data centers can be completely down and the other is capable of handling 100 percent of the traffic. We also run at N + 1 with no single point of failure within a data center, meaning we need enough hardware and software for more than double the anticipated scale. For instance, if a customer needs to support 10,000 simultaneous phone calls, we would build more than 21,000 ports to accommodate the anticipated call traffic.
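The arithmetic behind that kind of sizing can be sketched as follows. The interview does not state how many ports a single unit of hardware provides, so `unit_capacity` and the value of 500 used below are purely hypothetical; the sketch also assumes two data centers, each sized to carry the full peak with one spare unit.

```python
import math


def ports_required(peak_calls: int, unit_capacity: int, data_centers: int = 2) -> int:
    """Total ports to build when every data center can carry the full peak (N + N)
    and each data center holds one spare unit (N + 1)."""
    units_for_peak = math.ceil(peak_calls / unit_capacity)  # N
    units_per_site = units_for_peak + 1                      # N + 1
    return units_per_site * unit_capacity * data_centers


# With the hypothetical unit size of 500 ports:
# 20 units cover 10,000 calls, 21 units per site give 10,500 ports,
# and two sites give 21,000 ports in total.
total = ports_required(10_000, unit_capacity=500)
```

Whatever the real unit size, the total always exceeds double the peak, because each site carries the full load plus at least one spare unit.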

We do not always know in advance what sort of capacity a customer will require, so scale should be considered from the very start. Systems may need to be scaled up or down at a moment’s notice. 

For us, scale applies to both the platform as a whole and its components. Phone calls and the people on them are not very tolerant of latency.

 

How have you adjusted for growth over time? 

Scale is in our DNA. Everyone that starts on a project is trained to ask how the solution can scale – it’s been this way since day one. The platform that runs our core software has been improved continually over the past 30 years. You probably wouldn’t be able to find a single line of code from 30 years ago, but the principles that went into our first-generation platform have directly informed the thinking in our latest iterations. We have always run highly available systems in order to meet customer requirements. The one thing that has changed is the sheer scale of the platform itself, which today processes in excess of 1 billion minutes per month. 

“Scale is in our DNA. Everyone that starts on a project is trained to ask how the solution can scale.”

In order to sustain this growth, we maintain a software architecture consisting of multiple “cells” that can be replicated to provide any level of scale we need. Each cell is a fully functioning, self-reliant call-processing unit capable of handling tens of thousands of phone calls. Adding more capacity is simply a matter of adding another copy of a cell.

 

What tools does your team use to support scalability?

To reach scale, we apply many of the standard scaling principles commonly seen in the software world, including heavy use of localized cache, modularity and automatic scale orchestration. These principles are applied to the cells I mentioned earlier. 

Our latest platform is orchestrated entirely by Kubernetes, and judicious use of proprietary system-health monitoring software keeps cells up and running autonomously. The redundant hardware approach I described earlier allows for individual systems to fail and be repaired by Kubernetes orchestration.

In the cells themselves, we employ a number of technologies to keep them functioning at high efficiency and high availability. This is where local caching becomes important — given that common configuration databases are not expected to be always available. For example, individual systems within cells produce content that must be available to central aggregation systems for things like reporting and billing. To accomplish this, each cell requires database replication, queuing, distributed event stores and stream processing.
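To illustrate why local caching helps when a shared configuration database may be unreachable, here is a minimal sketch of a cache that falls back to its last known value on failure. It is not Vail Systems' implementation: the `CachedConfig` class, the `fetch` callable and the TTL value are all hypothetical.

```python
import time


class CachedConfig:
    """Serve configuration from a local cache and tolerate an unavailable source."""

    def __init__(self, fetch, ttl_seconds: float = 60.0, clock=time.monotonic):
        self.fetch = fetch        # callable that reads one key from the central store
        self.ttl = ttl_seconds
        self.clock = clock        # injectable for testing
        self._cache = {}          # key -> (value, time the value was fetched)

    def get(self, key):
        now = self.clock()
        entry = self._cache.get(key)
        if entry is not None and now - entry[1] < self.ttl:
            return entry[0]       # fresh enough, no remote call needed
        try:
            value = self.fetch(key)
        except Exception:
            if entry is not None:
                return entry[0]   # source is down: serve the stale value
            raise                 # nothing cached yet, so the failure must surface
        self._cache[key] = (value, now)
        return value
```

The trade-off is that a cell may briefly act on out-of-date configuration, which is usually preferable to dropping calls while the central store recovers.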

 

 

Anthony Spatafora
Manager, Engineering • LogicGate

LogicGate is a software company that helps businesses automate and assess risk compliance. 

 

Describe what scalability means to you. 

Scalability can be defined as “the ability of a computing process to be used or produced in a range of capabilities.” For LogicGate, scalability means more than just being able to handle a large amount of data or a large number of users; it is the ability to increase your dataset without negatively impacting the user experience. You can throw money at a problem by adding more nodes or increasing each instance size to handle the extra load. What it can’t do is ensure a long-lasting solution that scales as you grow.

At LogicGate, scalability is at the forefront of feature ideation, creation and optimization for the Risk Cloud®. Scalability is vital for this, as we want to ensure that our industry-leading GRC platform offers a smooth and easy-to-use experience for our customers and their users.

 

How do you kickstart this tech with scalability in mind?

I always like using a “shift left” mentality when it comes to building features in terms of performance, scalability, user experience and testing. I recommend everyone think about scalability at the earliest stage of planning.

“I always like using a ‘shift left’ mentality when it comes to building features in terms of performance, scalability, user experience and testing.”

 

At LogicGate, we encourage our engineers to build off of a base technical specification that lays out possible front-end components, API contracts, back-end services and a data model to tie it all together. This paints a clearer picture and allows for a technical review cycle that brings in staff software engineers, development operations and other SMEs to start asking the tough questions like, “Can our current stack support this?” or “Where might we have a bottleneck?” From there, we utilize existing tools to test our features as they’re built.

By composing solutions during the technical specification phase of the software development life cycle, we can catch issues that a customer would otherwise feel when the feature is released. At times, even the best-laid plans can go awry when put into the wild. Additional monitoring and observability are critical to help drive a reactive approach when things aren’t scaling.

 

What tools does your team use to support scalability?

We utilize AWS with Terraform and a graph database, Neo4j, to help us build out a scalable solution for all our current and future customer needs. With AWS and Terraform, we have the flexibility to increase instance sizes, add memory and scale horizontally with a quick turnaround. It also allows us to build out custom orchestration tools to easily stand up new environments with just the right data that a specific customer may need. In tandem with this, LogicGate leverages Datadog to help with our monitoring and observability.

We also utilize Neo4j to build out highly dynamic and customizable queries. Using a graph database gives us a lot of flexibility to selectively load only the data that is needed. This has empowered engineers to revitalize our application across the board with enhancements to our record searching, reporting and automated job conditions.

 

 

Naman Mehta
Lead Software Engineer • Snapsheet

Snapsheet is an insurance tech company that focuses on auto claims. 

 

Describe what scalability means to you. 

To me, scalability is about being able to expand efficiently. Depending on the context, scalability can be a lot of different things. It can happen as an application load increases with more customers or traffic. It can happen as the company expands into new verticals and products or when the codebase adapts without much overhaul. It could also be when more software engineers are hired. It can also be when your codebase grows with a clear separation of features, concerns and team ownership.

Scalability is important because without it, whatever dimension you are expanding can hit a wall. At the infrastructure level, there could be sluggish response times as requests increase. At the organization level, there could be a rapid expansion in the number of developers without a corresponding increase in development output.

“Scalability is important because without it, whatever dimension you are expanding can hit a wall.”

 

How do you challenge engineers to think ahead? 

Developers should be encouraged to think long term and ask themselves questions. What if tomorrow, an assumption about this feature I am writing today changes? What if my company builds software for internal employees today but it might be marketed to external customers in the future? If we challenge engineers to think beyond today, we can open up options for the future.
 

 


Responses have been edited for length and clarity. Images provided by Shutterstock and listed companies.
