source link: https://www.algolia.com/blog/engineering/supercharging-search-for-ecommerce-solutions-with-algolia-and-mongodb-use-case-architecture-and-current-challenges/

Part 1: Supercharging search for ecommerce solutions with Algolia and MongoDB — Use-case, architecture, and current challenges

Jul 14th 2022 engineering


We invited our friends at Starschema to write about an example of using Algolia in combination with MongoDB. We hope that you enjoy this four-part series by Full Stack Engineer Soma Osvay.

If you’d like to skip ahead, here are the other links:

Part 2 – Proposed solution and design

Part 3 – Data pipeline implementation

Part 4 – Frontend implementation and conclusion


Hey, I’m Soma and I’m part of a relatively small consulting team that maintains multiple real-estate ecommerce solutions for one of our most high-profile clients. These apps enable end users to sell, buy, or rent homes across multiple countries. They aren’t classical webshops, though. In some cases, we just connect supply and demand, but in the case of renting, we actually manage transactions, orders, cancellations, and communication between the parties. We’ve greatly expanded the scope of these services in the past few years, which allowed our client to explore new business opportunities and invest both time and resources into adding new features, upgrading the infrastructure, and focusing on data-driven decision making.

We recently added a lot of new services to the webshops, like:

  • Social media integration
  • Advanced SEO
  • Possibility for real-estate agent registration
  • AI-based recommendation system
  • Real-estate history
  • Neighborhood analysis

Because of these new services and organic growth, the number of listings on the websites has increased dramatically. That brought focus to the fact that our search capabilities were not meeting our client’s demands. Search results were often irrelevant, inaccurate, and slow, creating a bad experience for users and a bad reputation for the company.

Providing accurate real-estate search results is a challenge because there are many relevant attributes. Customers expect to search by:

  • Geographical location
  • Type of neighborhood (urban, suburban, busy, quiet, etc.)
  • Type of building (newly built, old, industrial)
  • Proximity to public transportation, schools, sports facilities, etc.
  • Price range
  • Energy and green rating
  • Eligibility for certain loans or other financing solutions

Customer surveys indicated that poor search results translated directly into lost revenue for our client. Customers were leaving the websites after only a few unsuccessful searches! Control groups showed that only persistent users who fell back to simpler search terms found the listings they were looking for. This made it abundantly clear that providing fast, accurate, and relevant search results would raise the product’s click-through rate, so it was designated a top-priority feature for the next release cycle.

My task is to come up with a solution to improve the search capabilities of the ecommerce sites. This blog post goes through my journey in implementing a basic proof-of-concept search solution that we can build on later to create advanced search for our services.

Current webshop architecture & data pipeline

The architecture for this particular webshop is somewhat fractured: the consumer-facing and admin-facing applications are completely separate, managed by different teams, and connected only at the data level.

The admin application is used by real-estate owners and agents to list their available properties for buying and renting, manage rent orders, and see stats, among other things. This application isn’t managed by my organizational unit. It connects to a single database that contains the application data.

The consumer application is marketed, publicly available, listed in Google, and has advanced SEO. This is the one my team is responsible for, and the one we’ll be improving in this article series. Consumers use this app to browse properties they are interested in buying or renting, and it allows them to handle financial transactions, communicate with other users, and so on.

Architecture

Our current architecture looks like this:

Diagram of current architecture

Challenges & possible solutions

Our main challenge is that MongoDB is limited in the search department. The cloud-hosted version of MongoDB offers Atlas Search, a fine-grained full-text indexing solution, but it isn’t applicable to us: we run an on-premise MongoDB instance, which only supports legacy text search. That search can’t work across multiple fields or dig through significant amounts of textual content.

Legacy text search poses a couple of issues for us:

  • The $text operator doesn’t offer true full-text search. It can match phrases and keywords, but that is often not enough.
  • Relevance ranking is limited to the built-in textScore metric, which we can’t tune to our domain.
  • It doesn’t let us easily search through multiple fields and arrays, since only one text index is allowed per collection.
  • Words with diacritics (like é or õ) should be treated as identical to their diacritic-free counterparts (e and o, in this case), but achieving that is very slow.
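To make these limitations concrete, here is a minimal sketch of what a legacy $text query looks like. The field names, search terms, and `listings` collection are hypothetical; only the query documents are built here, with the actual PyMongo calls shown in comments:

```python
# MongoDB allows only ONE text index per collection, so every field we
# ever want to search must be declared in that single index up front.
# "text" is the index type passed to create_index.
text_index_spec = [("title", "text"), ("description", "text")]

# A legacy $text query: keyword and quoted-phrase matching only.
# There is no fuzzy matching, no typo tolerance, and no way to pick
# which indexed field to search on a per-query basis.
query = {"$text": {"$search": '"quiet suburban" renovated'}}

# Relevance is limited to the built-in textScore metric.
projection = {"score": {"$meta": "textScore"}}
sort_spec = [("score", {"$meta": "textScore"})]

# Against a live deployment, this would run as:
#   db.listings.create_index(text_index_spec)
#   db.listings.find(query, projection).sort(sort_spec)
```

Note that even this best-case query gives us keyword matching plus a single opaque score; none of the faceted, multi-attribute searches our customers expect (price range, energy rating, proximity, and so on) fit this model.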

Because of these problems, our app doesn’t provide a smooth search experience for our end users.

We experimented with an on-premise hosted version of Elasticsearch (using the MongoDB River plugin for synchronization), but hosting and fine-tuning Elasticsearch proved difficult for our team. We ended up with a very unpredictable search experience that was not only slow on many occasions, but also inaccurate.

To add to that, an Elasticsearch integration would require us to modify our entire application stack! Both the backend and the frontend of our application would need heavy modification to make it work. We would have to manage application security, write our own UX code, and write complex indexing logic. We also have a lot of legacy code in our application, so modifying, testing, and deploying new versions is costly, as it requires extensive domain knowledge and experience that only a few developers here have.

Seeing these problems, we gave up on Elasticsearch. The development cost in both time and resources was just too great for the benefit it would bring.

Algolia

After doing some internal and external research, I found that other teams inside my company are using a solution called Algolia to improve their products’ search. They told me a great deal about the quality of the SDKs it provides and how simple it is to integrate into existing systems without having to change the backend code at all.

Take a look at this feature list!

  • It’s cloud native and promises very high scalability.
  • We can minimize the coding required to enable the capability, especially on the backend, which would otherwise be a bottleneck.
  • Our data engineers and domain specialists can collaborate on the project and create value together instead of separately.
  • It simplifies the search development process, since it doesn’t require in-depth technical knowledge about defining optimized indexes.
  • It’s low-maintenance and makes it easy to organize large data volumes.
  • It’s simple to manage and configure search rules to improve search accuracy, even giving us the option to fine-tune with AI.
  • There are very easy-to-use frontend SDKs that can integrate with our existing frontend application.
  • It provides pre-configured analytics and KPIs, reducing the development cost even more.
  • It supports sending events from the frontend to further improve accuracy and power analytics.
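As a first taste of how little backend code this requires, here is a sketch of turning a MongoDB listing document into the flat JSON record Algolia indexes. The document fields are hypothetical; the one hard requirement Algolia imposes is a unique objectID per record, for which the MongoDB _id is a natural fit:

```python
def to_algolia_record(doc: dict) -> dict:
    """Flatten a MongoDB listing document into an Algolia record.

    Algolia requires a unique `objectID` per record; everything else
    is plain JSON usable for searching, faceting, and ranking.
    """
    record = {k: v for k, v in doc.items() if k != "_id"}
    record["objectID"] = str(doc["_id"])
    return record

# A hypothetical listing document as it might sit in MongoDB.
listing = {
    "_id": "62d01c2f9b1e8a3f4c0d1234",
    "title": "Sunny 3-bedroom family home",
    "city": "Budapest",
    "price_eur": 315000,
    "energy_rating": "B",
}

record = to_algolia_record(listing)

# With the official algoliasearch Python client, the record would then
# be pushed to an index via:
#   index.save_objects([record])
```

How records like this actually get ingested and kept up-to-date is exactly what Part 3 of this series works out.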

For these reasons, I decided to build the PoC for our new search indexing capability using Algolia.

In the second article of this series, I will cover the design specification of the PoC and talk about the implementation possibilities.

In the third article of this series, I will implement the data ingestion into Algolia and figure out how to keep that data up-to-date.

In the fourth article of this series, I will implement a sample frontend so we can evaluate the product from the user’s perspective and give the developers a head-start if they choose to go with this option.

