How Canva Scaled Their Search to Handle 1M+ Searches Per Minute

Canva massively leveled up their search by stripping away duplicated processes

Richard Oliver Bray

Oct 23, 2024

Canva is a web-based design tool. It can be used to create graphics, presentations, videos, and more.

It's also insanely popular, with over 170 million users worldwide creating over 180 designs every second.

Canva's search is 'foundational' to its success. It allows users to search for templates, design assets, and other media.

It handles over 20,000 requests every second, which works out to more than 1 million every minute. But it had significant architectural issues that caused downtime and performance problems.

Here's how they addressed it.

Estimated reading time: 4 minutes 35 seconds

Why Does Canva Have Search?

If you have never used Canva, you may be wondering why its search can be so pivotal.

Well, it has a huge content library, with over 100 million stock images, videos, and graphic elements, as well as 600,000 templates.

Its users commonly search for media assets to help with designs. For example, balloon assets would help with a birthday card design.

The search functionality had four different servers and search indexes, one for each category: media (images, videos, graphics), templates, fonts, and audio.

There was some shared code between the servers but for the most part, they were completely separate.

This meant any updates to the search would have to be done 4 times. It was also difficult to do things like A/B testing without lots of duplication.

The team needed to find a way to reduce repeated code.

Sidenote: Search Index

Imagine you have a database of sentences, and you wanted to search for the text "brown dog."

By default, a traditional database search would only return an exact match. So, the sentence "The brown dog jumped over the fence" would be returned, but "The dog brown jumped over the fence" would not.

A search index is a data structure designed to help with this problem.

It works by breaking the text into individual words, or tokens, and recording which sentences contain each one. A search can then return all the sentences that contain the query's words, regardless of their order.

It also stores things like word frequency and the position of each word in the sentence, and caches results for faster searches.

This, of course, takes up more storage space and processing power. But it makes searching much faster.

A popular piece of software for creating search indexes is Apache Lucene. We will talk more about this later.
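
To make the idea concrete, here is a minimal sketch of a search index in plain Python. It is a toy, not how Lucene works internally, and the function names (tokenize, build_index, search) are my own. It maps each token to the sentences and positions where it appears, then intersects those sets at query time.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into tokens, dropping basic punctuation."""
    words = (w.strip('.,!?"\'').lower() for w in text.split())
    return [w for w in words if w]


def build_index(sentences):
    """Map each token to {sentence_id: [positions]} (a tiny inverted index)."""
    index = defaultdict(dict)
    for sentence_id, sentence in enumerate(sentences):
        for position, token in enumerate(tokenize(sentence)):
            index[token].setdefault(sentence_id, []).append(position)
    return index


def search(index, query):
    """Return the ids of sentences that contain every token in the query."""
    result = None
    for token in tokenize(query):
        ids = set(index.get(token, {}))
        result = ids if result is None else result & ids
    return result or set()


sentences = [
    "The brown dog jumped over the fence",
    "The dog brown jumped over the fence",
    "A grey cat slept",
]

# Exact substring matching only finds the first sentence.
exact = [s for s in sentences if "brown dog" in s.lower()]

# The index finds both sentences, because word order no longer matters.
index = build_index(sentences)
matches = sorted(search(index, "brown dog"))
print(exact)
print(matches)
```

The stored positions are not used by this search, but they are what would let a real engine support phrase queries or rank closer matches higher.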

How Search Actually Worked

Even though Canva's search wasn't as complex as Google's, it still had many steps to go through.

These are the steps that would take place if a user searched for "brown dog."

Rewriting: Transform the query into standardized text. In the case of "brown dog," not much would be done. But if it had any spelling mistakes, was in another language or had uppercase letters, it would be rewritten.
Tokenization: Split the text into individual words or tokens.
Annotation: Add extra information (metadata) to expand the tokens and the search query. For example, it figures out that "dog" is an animal and "brown" is a color.
It can also be configured to find synonyms, like chocolate or auburn for "brown," and canine or hound for "dog."
Candidate generation: This step reduces a large amount of data using various techniques. The goal is to create a smaller set of results based on the annotated query.
Re-ranking: Reorder the narrowed results based on relevance. For example, if a user had searched for a vet before, brown dogs with vets might be shown first.

After these five steps, the results are returned to the user. This is known as a search pipeline, and these were the core steps that were repeated across the four different search servers.
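
The five steps above can be sketched as a chain of small functions. This is only an illustration under my own assumptions: the catalog, the tag-based matching, and the scoring rule are all made up, and the re-ranking here simply counts matched query words rather than using a user's history.

```python
from dataclasses import dataclass, field

# Hypothetical data for illustration; the synonyms come from the example above.
SYNONYMS = {"brown": ["chocolate", "auburn"], "dog": ["canine", "hound"]}
CATALOG = [
    {"id": "a", "tags": ["hound", "park"]},
    {"id": "b", "tags": ["brown", "dog"]},
    {"id": "c", "tags": ["balloon"]},
    {"id": "d", "tags": ["dog", "vet"]},
]


@dataclass
class Query:
    text: str
    tokens: list = field(default_factory=list)
    annotations: dict = field(default_factory=dict)


def rewrite(query):
    # Standardize the text: lowercase and collapse extra whitespace.
    query.text = " ".join(query.text.lower().split())
    return query


def tokenize(query):
    query.tokens = query.text.split()
    return query


def annotate(query):
    # Attach synonyms to each token.
    query.annotations = {t: SYNONYMS.get(t, []) for t in query.tokens}
    return query


def generate_candidates(query, catalog):
    # Keep only items that share at least one tag with the expanded query.
    terms = set(query.tokens)
    for synonyms in query.annotations.values():
        terms.update(synonyms)
    return [item for item in catalog if terms & set(item["tags"])]


def rerank(query, candidates):
    # Items matching more of the original query words come first.
    original = set(query.tokens)
    return sorted(candidates, key=lambda item: len(original & set(item["tags"])), reverse=True)


def run_pipeline(text, catalog):
    query = annotate(tokenize(rewrite(Query(text))))
    return rerank(query, generate_candidates(query, catalog))


results = [item["id"] for item in run_pipeline("  Brown DOG ", CATALOG)]
print(results)
```

Because each step only reads and writes the Query object, any one of them can be swapped out without touching the others, which is the property the team was after.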

The team planned to separate these steps into their own components. This meant individuals could contribute without having to understand the entire system.

They also wanted to change the candidate generation step to use Elasticsearch instead of Solr.

Sidenote: Elasticsearch vs. Solr

Both Elasticsearch and Solr are platforms designed to efficiently search through search indexes.

They're both built on top of Apache Lucene, the Java-based open-source software used to create search indexes.

But they do have some differences, which mostly favor Elasticsearch.

Architecture: Elasticsearch was designed as a distributed system from the start, which makes it easier to scale. Solr began as a standalone server, and its distributed mode (SolrCloud) typically requires more manual configuration for scaling.
Querying: Solr is great for text search, but Elasticsearch is better at filtering, grouping data, and real-time indexing.
Ease of Use: Solr can be complex to set up for beginners, but Elasticsearch is much easier. It has better documentation and a more user-friendly interface. It also has a larger active community with many plugins and extensions.

You can easily see why Canva switched from Solr to Elasticsearch. For more details, the team has put together a detailed article on their reasons.
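
To give a feel for the filtering point, here is a sketch of a request body in Elasticsearch's query DSL, built as a plain Python dictionary. The bool, must, match, filter, and term clauses are standard query DSL, but the field names (title, category) and the helper function are hypothetical and not taken from Canva's setup.

```python
import json


def build_search_body(text, category, size=20):
    """Build a query DSL body: full-text match on one field, exact filter on another."""
    return {
        "size": size,
        "query": {
            "bool": {
                # "must" clauses contribute to the relevance score.
                "must": [{"match": {"title": text}}],
                # "filter" clauses only include or exclude documents.
                "filter": [{"term": {"category": category}}],
            }
        },
    }


body = build_search_body("brown dog", "media")
print(json.dumps(body, indent=2))
```

A body like this would be sent to an index's _search endpoint. Filters do not affect scoring, which is part of why they are cheap to apply and easy to cache.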

The Migration

The decision to create components wasn't made overnight. There were many prototypes and experiments that led to this.

The team also paid special attention to creating a stable and clean interface, making sure they followed good software design practices.

They came up with three criteria for each component:

Transient: Each component can be removed and reintroduced without any issues.
Stateless: Components should manage their own state and not share it with others. For example, the candidate generator doesn't need to know how many annotations have been cached.
Ordered: Each component processes data in the search pipeline order, apart from the annotation and candidate generation steps, which can occur simultaneously. Why?
As soon as the first set of annotations is processed, such as 'brown dog,' 'chocolate,' and 'canine,' they can be sent to the candidate generator to fetch some basic results.
The annotation generator can continue producing more synonyms. These can then be sent to the candidate generator to refine the initial set of results.
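
Here is a rough sketch of that hand-off between annotation and candidate generation. It uses a Python generator to model annotations arriving in batches, so it shows the streaming idea rather than true parallelism. The catalog, the batch order, and the function names are my own assumptions, not Canva's implementation.

```python
# Hypothetical data for illustration only.
SYNONYMS = {"brown": ["chocolate", "auburn"], "dog": ["canine", "hound"]}
CATALOG = [
    {"id": "a", "tags": ["hound", "park"]},
    {"id": "b", "tags": ["brown", "dog"]},
    {"id": "c", "tags": ["balloon"]},
    {"id": "d", "tags": ["dog", "vet"]},
]


def annotation_batches(tokens):
    """Yield annotations in batches: the original tokens first, synonyms later."""
    yield list(tokens)
    for token in tokens:
        synonyms = SYNONYMS.get(token, [])
        if synonyms:
            yield synonyms


def stream_candidates(tokens, catalog):
    """Fetch candidates for each batch as it arrives, skipping items already found."""
    seen = set()
    for batch in annotation_batches(tokens):
        terms = set(batch)
        new_ids = [
            item["id"]
            for item in catalog
            if item["id"] not in seen and terms & set(item["tags"])
        ]
        seen.update(new_ids)
        yield batch, new_ids


steps = list(stream_candidates(["brown", "dog"], CATALOG))
for batch, new_ids in steps:
    print(batch, "->", new_ids)
```

The first batch already produces usable results, and later batches only add to them, so the candidate generator never has to wait for the annotation step to finish.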

After lots and lots of work, the team created a component library that each search server could use.

The team wrote very detailed articles on the problems, including their process for creating the component library from the search pipeline. But there wasn't much in the way of an updated architectural diagram.

So I've taken some creative freedom in drawing this new diagram based on my understanding of their solution.

The new architectural model still has many servers, but each has identical search pipeline components.

This change not only helped Canva release search updates quicker. It also allowed components to scale horizontally in the server based on the load, and allowed the team to add observability and monitoring to each component of their search.

Wrapping Things Up

When I first looked at this article on the Canva blog, which has a part 1 and a part 2, I wasn't sure if it would be interesting enough to write about.

But I'm surprised at how much I could get out of it. Who knew search engines for design tools could be so complicated?

As usual, if you enjoyed this article, go ahead and subscribe to get the next one as soon as it's written.

Hacking Scale by Better Stack