Built In Chicago is kicking off a series profiling the craftsmanship of Chicago’s digital startups that are solving the toughest technology problems from the ground up. This is not a series about the C-level execs or venture capitalists behind these companies, but rather a zoomed-in view of the tech and the products that are making Chicago history. Yes, that’s right, these stories will dive into the nitty-gritty techie projects that you’ve always (or never) wanted to know about.
kCura’s flagship product Relativity is an e-discovery application that helps law firms and other organizations search, analyze and review electronic information needed for litigation. It’s an impressive example of Chicago craftsmanship that is solving huge problems on an equally huge scale: managing large amounts of digital information. There are 39,388 active cases currently running in Relativity and about 27.8 billion files under management in the software.
The problem of scaling search
One constant challenge for kCura is scaling Relativity so that it can quickly perform complex full text searches, no matter the size of its cases. About four years ago, the average size of a large case in Relativity was “reasonable” at about 500,000 to 1 million emails and other pieces of electronically stored information, such as PDFs, Word documents and Excel spreadsheets, Director of Marketing Communications Shawn Gaines said. Now, however, “the volumes are getting more and more massive,” with average case sizes ranging from 10 to 20 million documents and the largest one bordering on 105 million documents.
Building out Relativity’s search capabilities took a unique approach, as the search problem in Relativity is very different from that of most applications that require full text search. Here are a few key issues kCura had to look at to tackle this problem.
- Each Relativity install needs to support hundreds and sometimes thousands of containers of documents, which are called cases or workspaces. Each of these workspace collections can contain anywhere from hundreds to hundreds of millions of documents that must be full text searchable.
- How search works in Relativity is much different from a traditional Internet or enterprise search solution. When users search for a term in an Internet or enterprise search engine, the engine’s job is to return the items that it believes are the most relevant to that search and rank them in order. Relativity’s job is to return every single relevant item, which can be millions of documents, and then allow the user to quickly act on them by marking them all for further review.
- The workloads of Relativity continually change depending on where the case might be in its life cycle. For example, when users collect data and prepare a workspace, they might be in the process of indexing millions of documents which they need to quickly make searchable. Or, during an early analysis phase, users could be running thousands of complex searches to identify trends and patterns in the case. Users need Relativity’s search solution to be “elastic,” so that server resources can be easily added or removed from a simple web-based interface to support the current workload.
Since the problems Relativity faced with scaling search were unique, kCura chose to develop their own distributed search system, which they affectionately named “Search Grid.”
“To make Relativity search elastic, the architecture behind the software was changed to allow both indexing and search to run on either a single server or a distributed set of hundreds,” Director of Engineering Perry Marchant said. “It really is about how to perform a search on a hundred different servers, spreading parts of the search across and then aggregating the results very quickly, so we can return them back to the user.”
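The pattern Marchant describes, spreading one search across many servers and then aggregating the partial results, is commonly called scatter-gather. Below is a minimal Python sketch of that idea only; it is not kCura’s code. The in-memory `NODES` list, `search_node` and `grid_search` are hypothetical names invented for illustration, and plain substring matching stands in for a real full text index. Note that it returns every matching document ID rather than a ranked top few, in line with the requirement described above.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for search nodes: each dict holds one slice
# of a workspace, mapping document ID -> extracted text.
NODES = [
    {1: "merger agreement draft", 2: "lunch plans"},
    {3: "signed merger agreement", 4: "quarterly report"},
    {5: "notes on the merger", 6: "holiday schedule"},
]


def search_node(shard, term):
    """Return the IDs of every document on one node whose text contains the term."""
    return {doc_id for doc_id, text in shard.items() if term in text.lower()}


def grid_search(nodes, term):
    """Scatter the term to all nodes in parallel, then gather the union of the hits."""
    term = term.lower()
    hits = set()
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        # Each node searches only its own slice; results arrive as partial sets.
        for partial in pool.map(lambda shard: search_node(shard, term), nodes):
            hits |= partial
    # Every relevant document is returned, not just the highest ranked ones.
    return sorted(hits)


print(grid_search(NODES, "merger"))  # [1, 3, 5]
```

In a real deployment each node would be a separate server reached over the network, so the aggregation step would also have to handle timeouts and failed nodes, which this sketch leaves out.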
To adjust for multiple types of workloads, Relativity includes a web-based administration tool through which administrators can control how many servers participate in the grid, as well as allocate and de-allocate servers on the fly.
“Being able to scale resources on the fly is one of the unique aspects of Search Grid,” Marchant said. “The system is smart and can instantaneously allocate additional servers on the fly without having to rebalance data across multiple nodes. It’s one of the powerful features of Search Grid that truly makes it elastic.”
kCura’s team spent about eight months building the initial Search Grid technology, with design, coding and testing initiatives going on simultaneously following an Agile process, Marchant said.
Relativity is built on a Microsoft technology stack, Search Grid included: “Search Grid integrates with our Agent framework which enables distributed processes in Relativity. Each node is exposed [as] a simple REST Server that can be dynamically instantiated on an Agent server. Federating Search Grid results with SQL results is important, so we’ve made extensions in the database (MS SQL Server) to proxy the search request, call the available nodes, aggregate the results, and then combine with a SQL query that controls other aspects of the user’s request,” Marchant said.
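As a rough illustration of what federating search results with SQL results can look like, the Python sketch below loads a set of full text hits into a temporary table and joins it against a metadata filter. Everything here is an assumption made for the example: Python’s built-in `sqlite3` module stands in for MS SQL Server, and the `documents` table, its `custodian` column, the `search_hits` table and the `federate` function are hypothetical. The article does not describe how kCura’s database extensions actually work.

```python
import sqlite3


def federate(conn, hit_ids, custodian):
    """Combine full text hits with a SQL filter on document metadata."""
    # Stage the aggregated search hits so the database can join against them.
    conn.execute(
        "CREATE TEMP TABLE IF NOT EXISTS search_hits (doc_id INTEGER PRIMARY KEY)"
    )
    conn.execute("DELETE FROM search_hits")
    conn.executemany(
        "INSERT INTO search_hits (doc_id) VALUES (?)", [(i,) for i in hit_ids]
    )
    # The SQL side controls the rest of the request (here, a custodian filter).
    rows = conn.execute(
        "SELECT d.doc_id FROM documents d "
        "JOIN search_hits h ON h.doc_id = d.doc_id "
        "WHERE d.custodian = ? ORDER BY d.doc_id",
        (custodian,),
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical metadata table; sqlite3 is only a stand-in for MS SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (doc_id INTEGER PRIMARY KEY, custodian TEXT)")
conn.executemany(
    "INSERT INTO documents VALUES (?, ?)",
    [(1, "alice"), (2, "bob"), (3, "alice"), (4, "alice"), (5, "bob")],
)

# Suppose the search nodes reported documents 1, 3 and 5 as hits.
print(federate(conn, [1, 3, 5], "alice"))  # [1, 3]
```

Doing the final combination inside the database keeps the metadata filtering next to the data it filters, which is one plausible reason to proxy the search request through SQL as the quote describes.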
The teams behind the technology
Dealing with the problem of scaling Relativity is a team that kCura calls Sharktopus. (All of kCura’s 13 teams go by catchy names such as Sharktopus or Zombie Moose, by the way.) Sharktopus consists of seven software engineers (including three in test), a product manager, an engineering manager and a documentation specialist, all of whom work specifically to make Relativity scale.
How many engineers does it take to put together a 3D shark puzzle? Sharktopus found out during a Friday happy hour.
Teams are the base of the company’s widely renowned culture, which has repeatedly earned the 370-person team culture and innovation accolades. Since CEO Andrew Sieja founded it in 2001, kCura has always been innovating. Because Relativity isn’t just replacing an existing technology in a faster or cheaper way, innovation is the only answer to such challenging technology problems. Through innovation, the team has overcome these challenges and built Relativity from the ground up.
And just as much as they are innovating, they are growing: the company has grown over 1,600 percent since 2008 and made 152 hires in 2013 alone. Currently, kCura has over 40 open positions, about 45 percent of which are in product development.
“Developers want to be working on problems that are interesting and that are going to be more widely used in the future, so that they continue to build out their breadth of experience,” Marchant said.
kCura, with its complex technology problems, provides that opportunity for Chicago developers. And as kCura’s development team continues to come face-to-face with the toughest tech challenges, they are actively crafting Chicago tech history.
“It’s exciting to be in a challenging industry where our team has the opportunity to tackle unique problems that improve our users’ experience in our software,” Sieja said. “Chicago has been a great place to build our business - and we’ll work hard to keep up the awesome momentum.”