Search Upgrades

by Ben Ubois

Search has been improved at every level, with new features, software, and hardware. Oh, and it’s about 10 times faster.

Features

There’s a nice new way to search within a feed or tag. When you start typing in the search field, Feedbin will suggest sources to search within. Choosing one of these sources will filter the search to only find results within your selection.

When you already have a feed or tag selected in the source column, the search field will be automatically scoped to the selected source.

There are also a few new fields that you can use to find exactly what you’re looking for.

You’ve long been able to search by the published date, but this field has a new feature: relative dates.

For example, if you want to set up a saved search to see all your unread articles that were published in the last 24 hours, you could use the query: published:>now-1d is:unread. You can also search for a range. For example, if you want all unread articles that were published yesterday, this is how: published:[now-2d TO now-1d] is:unread
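The now-1d form resembles Elasticsearch date math. As a rough sketch of how such expressions could be resolved (this is not Feedbin's code; the function names resolve_relative and in_range are made up for illustration, and only a small subset of units is handled):

```python
import re
from datetime import datetime, timedelta, timezone

# Units handled by this sketch: days, hours, minutes (an assumed subset).
_UNITS = {"d": "days", "h": "hours", "m": "minutes"}


def resolve_relative(expr, now):
    """Resolve 'now' or 'now-1d' style expressions against a fixed `now`."""
    match = re.fullmatch(r"now(?:([+-])(\d+)([dhm]))?", expr)
    if match is None:
        raise ValueError(f"unsupported expression: {expr!r}")
    sign, amount, unit = match.groups()
    if sign is None:
        return now
    delta = timedelta(**{_UNITS[unit]: int(amount)})
    return now + delta if sign == "+" else now - delta


def in_range(published, lower, upper, now):
    """Inclusive range check, mirroring the [lower TO upper] bracket form."""
    return resolve_relative(lower, now) <= published <= resolve_relative(upper, now)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(resolve_relative("now-1d", now))
```

Read this way, [now-2d TO now-1d] is a sliding window from 48 to 24 hours ago, which is close to, but not exactly, the previous calendar day.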

Next up, link. Link can be used to search for the presence of links to specific domains. To find an article that links to the New York Times, you could search for link:nytimes.com. Link is also fully subdomain-aware, so you could search for link:cooking.nytimes.com. This field supports searching for multiple values, like link:(nytimes.com OR sfchronicle.com)
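One way subdomain-aware matching could work is to index every parent domain of each linked host. The sketch below assumes that a search for nytimes.com should also match links to cooking.nytimes.com; the post does not spell out that behavior, and host_suffixes and links_to are hypothetical helpers, not Feedbin's implementation.

```python
from urllib.parse import urlsplit


def host_suffixes(url):
    """Return the URL's host plus each parent domain with at least two labels."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".") if host else []
    return [".".join(labels[i:]) for i in range(max(len(labels) - 1, 0))]


def links_to(article_links, domains):
    """True if any link's host (or a parent domain of it) is in `domains`.

    Passing several domains behaves like the OR form of the query.
    """
    wanted = {d.lower() for d in domains}
    return any(wanted & set(host_suffixes(link)) for link in article_links)


if __name__ == "__main__":
    print(host_suffixes("https://cooking.nytimes.com/recipes/1"))
```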

Feedbin has become omnivorous in terms of the types of content it ingests. To reflect this direction, search has gained the ability to filter by type. These are the types you can search for:

  • feed
  • newsletter
  • podcast
  • twitter
  • youtube

For the podcast and youtube types, there’s another new field: media_duration.

Say you’re as old as I am, and you never want to see a “short” in your YouTube subscriptions. You could create an Action that marks matches to this query as read:

type:youtube media_duration:<120
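To make the matching logic concrete, here is a minimal sketch of how a query like this might be evaluated against an entry. It assumes media_duration is measured in seconds and that entries are plain dictionaries with type and media_duration keys; both are assumptions for illustration, and parse_query and matches are hypothetical names rather than anything Feedbin exposes.

```python
import operator

# Comparison prefixes handled by this sketch.
_COMPARATORS = {"<": operator.lt, ">": operator.gt}


def parse_query(query):
    """Split 'field:value' terms into a dict.

    Numeric comparisons such as '<120' become (operator, number) tuples;
    date expressions and ranges are out of scope for this sketch.
    """
    terms = {}
    for term in query.split():
        field, _, value = term.partition(":")
        if value[:1] in _COMPARATORS:
            terms[field] = (value[0], int(value[1:]))
        else:
            terms[field] = value
    return terms


def matches(entry, terms):
    """True only if the entry satisfies every term in the query."""
    for field, expected in terms.items():
        actual = entry.get(field)
        if isinstance(expected, tuple):
            op, number = expected
            if actual is None or not _COMPARATORS[op](actual, number):
                return False
        elif actual != expected:
            return False
    return True


if __name__ == "__main__":
    terms = parse_query("type:youtube media_duration:<120")
    print(matches({"type": "youtube", "media_duration": 45}, terms))
```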

Check the Search Syntax help page for the full documentation.

Infrastructure

To power all of this, the search infrastructure was upgraded as well. Feedbin had been using an ancient version of Elasticsearch, which was showing its age with poor performance and flaky reliability. Upgrading to Elasticsearch 8 fixed the reliability, but the performance still wasn’t great. The 95th percentile response time for a search was hovering between 1.5 and 3 seconds.
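For readers unfamiliar with the metric, the 95th percentile is the response time that 95% of requests come in at or under. A minimal sketch using the nearest-rank method follows; how Feedbin's monitoring actually computes it is not stated, so treat the method and the percentile helper as assumptions.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Rank is 1-based; clamp to 1 so very small pct values stay in range.
    rank = max(math.ceil(pct * len(ordered) / 100), 1)
    return ordered[rank - 1]


if __name__ == "__main__":
    # Illustrative timings in milliseconds, not real Feedbin measurements.
    timings_ms = [120, 95, 180, 2400, 150, 110, 130, 160, 140, 105]
    print(percentile(timings_ms, 95))
```

A percentile is used instead of an average because it reflects the slow requests that users actually notice.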

To remedy this, a hardware upgrade was needed. These are the specs for the new search server configuration:

  • AMD Ryzen 5950x (16 cores at 3.4GHz)
  • 128GB RAM (DDR4-3200)
  • 4TB storage (PCIe 4 NVMe)
  • Mellanox 10Gb NIC

Feedbin’s application servers were upgraded to use Ryzen 5000 series CPUs back in 2021 and I’ve been happy with the performance. They’re clocked higher than most EPYC or Xeon parts and don’t come with the premium price tag or high power requirements. The 5950x comes with 16 cores and 32 threads, so it’s actually great for highly concurrent server configurations, as you’re not giving much up in terms of core count.

Once this new configuration was installed at Feedbin’s datacenter, search performance improved dramatically.

Figure: 95th percentile response time for search queries on Feedbin

The response time is now consistently under 200 milliseconds.