Full text search in Rails using Elasticsearch

WBOY
2023-08-31 08:41:05

In this article, I will show you how to implement full-text search using Ruby on Rails and Elasticsearch. Nowadays, everyone is accustomed to typing a search term and getting suggestions and highlighted results for the search term. Autocorrect is also a nice feature if what you're trying to search for is misspelled, as we've seen on sites like Google or Facebook.

Achieving all of this functionality using just a relational database like MySQL or Postgres is not straightforward. So we use Elasticsearch, which you can think of as a database built and optimized specifically for search. It is open source and built on Apache Lucene.

One of the best features of Elasticsearch is that it exposes its functionality through a REST API, so there are libraries that wrap this functionality for most programming languages.

Elasticsearch Introduction

Earlier, I mentioned that Elasticsearch is like a database for search. It helps to be familiar with some of its terminology.

  • Field: A field is like a key-value pair. The value can be a simple value (string, integer, date) or a nested structure (such as an array or object). Fields are similar to columns in tables in relational databases.
  • Document: A document is a list of fields. It is a JSON document stored in Elasticsearch. It is like a row in a table in a relational database. Each document is stored in the index and has a type and unique ID.
  • Type: Types are like tables in relational databases. Each type has a list of fields that can be specified for documents of that type.
  • Index: An index is equivalent to a database in a relational system. It contains the definitions of multiple types and stores multiple documents.
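
To make the terminology above concrete, here is a minimal sketch of a document and of how index, type and ID combine into a REST path in the type-based layout this article describes. The index name "blog", the type name "article" and all field values are hypothetical, chosen only for illustration.

```ruby
require 'json'

# Hypothetical names for illustration: index "blog", type "article".
index = 'blog'
type  = 'article'
id    = 1

# A document is a set of fields; values can be simple or nested.
document = {
  title:  'Getting started with Ruby',
  text:   'Ruby is a dynamic programming language.',
  tags:   ['ruby', 'tutorial'],              # array value
  author: { name: 'Jane', country: 'ES' }    # nested object
}

# In the type-based layout, a document is addressed by index, type and ID.
path = "/#{index}/#{type}/#{id}"
body = JSON.generate(document)

puts path
puts body
```

Nothing is sent to Elasticsearch here; the snippet only shows the shape of the data and of the path a client library would use.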

One thing to note here is that when you write a document to an index in Elasticsearch, the document fields are analyzed word by word to make searching easy and fast. Elasticsearch also supports geolocation, so you can search for documents located within a certain distance of a given location. This is exactly how Foursquare implements search.
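
The idea behind analyzing fields at write time can be illustrated with a toy inverted index in plain Ruby. This is only a sketch of the general technique: the `analyze` and `build_inverted_index` helpers and the sample documents are made up for this example, and the real analyzers in Elasticsearch do considerably more.

```ruby
# Toy analyzer: lowercases text and splits it into word tokens.
def analyze(text)
  text.downcase.scan(/[a-z0-9]+/)
end

# Builds an inverted index: token => list of IDs of documents containing it.
def build_inverted_index(docs)
  index = Hash.new { |hash, token| hash[token] = [] }
  docs.each do |id, text|
    analyze(text).uniq.each { |token| index[token] << id }
  end
  index
end

# Hypothetical sample documents keyed by ID.
docs = {
  1 => 'Ruby on Rails tutorial',
  2 => 'Searching with Elasticsearch',
  3 => 'Rails and Elasticsearch together'
}

index = build_inverted_index(docs)
p index['rails']
p index['elasticsearch']
```

Because the work of splitting text into tokens happens when a document is written, a search for a word becomes a simple lookup instead of a scan over every document.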

I would like to mention that Elasticsearch is built with high scalability in mind, so it is easy to build clusters with multiple servers and have high availability even if some servers fail. I won't go into detail on how to plan and deploy different types of clusters in this article.

Install Elasticsearch

If you are using Linux, you may be able to install Elasticsearch from one of the package repositories. It is available for both APT and YUM.

If you are using a Mac, you can install it using Homebrew: brew install elasticsearch. After installing elasticsearch, you will see a list of related folders printed in the terminal.

To verify that the installation is working properly, start it by typing elasticsearch in the terminal. Then run curl localhost:9200 in another terminal and you should see a JSON response describing the running node, including its name and version.

Install Elastic HQ

Elastic HQ is a monitoring plugin that we can use to manage Elasticsearch from the browser, similar to phpMyAdmin for MySQL. To install it, just run in terminal:

/usr/local/Cellar/elasticsearch/2.2.0_1/libexec/bin/plugin install royrusso/elasticsearch-HQ

After the installation is complete, navigate to http://localhost:9200/_plugin/hq in your browser.

Click Connect and you will see a screen showing the cluster status.

At this point, as you might expect, no indexes or documents have been created yet, but we do have a local instance of Elasticsearch installed and running.

Create Rails Application

I'm going to create a very simple Rails application where you add articles to a database so that we can perform a full text search on them using Elasticsearch. Start by creating a new Rails application:

rails new elasticsearch-rails

Next we use scaffolding to generate a new article resource:

rails generate scaffold Article title:string text:text

Now we need to add a new root route so that we can see the article list by default. Edit config/routes.rb:

Rails.application.routes.draw do
  root to: 'articles#index'
  resources :articles
end

Create the database by running the command rake db:migrate. Then start rails server, open a browser, navigate to localhost:3000 and add some articles to the database, or just download the file db/seeds.rb with the dummy data I created so you don't have to spend a lot of time filling in the form.

Add search

Now that we have our little Rails application containing the articles in the database, we're ready to add search functionality. We'll start by adding references to two official Elasticsearch Gems:

gem 'elasticsearch-model'
gem 'elasticsearch-rails'

On many websites, it is common to have a search box in the top menu of every page. So I'm going to create a form partial at app/views/search/_form.html.erb. As you can see, I'm submitting the form using GET, so I can easily copy and paste the URL for a specific search.

<%= form_for :term, url: search_path, method: :get do |form| %>
  <p>
    <%= text_field_tag :term, params[:term] %>
    <%= submit_tag "Search", name: nil %>
  </p>
<% end %>

Add a reference to the form in the main site layout. Edit app/views/layouts/application.html.erb.

<body>
  <%= render 'search/form' %>
  <%= yield %>
</body>

Now we also need a controller to perform the actual search and display the results, so we run the command rails g controller Search to generate it.

class SearchController < ApplicationController
  def search
    if params[:term].nil?
      @articles = []
    else
      @articles = Article.search params[:term]
    end
  end
end

As you can see, I'm calling the method search on the Article model. We haven't defined it yet, so if we try to perform a search at this point, we'll get an error. Also, we haven't added the SearchController's routes in the config/routes.rb file, so let's do this:

Rails.application.routes.draw do
  root to: 'articles#index'

  resources :articles
  get "search", to: "search#search"
end

If we look at the documentation for the gem 'elasticsearch-rails', we need to include two modules on the model to be indexed in Elasticsearch, in our case article.rb.

require 'elasticsearch/model'

class Article < ActiveRecord::Base
  include Elasticsearch::Model
  include Elasticsearch::Model::Callbacks
end

The first module injects the search method we used in the previous controller. The second module integrates with ActiveRecord callbacks to index each article we save to the database, and it also updates the index when we modify or delete an article. So it's all transparent to us.

If you previously imported data into the database, these articles are still not in the Elasticsearch index; only new ones will be automatically indexed. Therefore, we have to index them manually, which is easy if we launch rails console. Then we just need to run irb(main) > Article.import.

Now we are ready to try out the search feature. If I type "ruby" and click Search, the matching articles are listed on the results page.

Search Highlight

On many websites, you can see how the terms you searched for are highlighted on the search results page. This is easy to do using Elasticsearch.

Edit app/models/article.rb and override the default search method:

def self.search(query)
  __elasticsearch__.search(
    {
      query: {
        multi_match: {
          query: query,
          fields: ['title', 'text']
        }
      },
      highlight: {
        pre_tags: ['<em>'],
        post_tags: ['</em>'],
        fields: {
          title: {},
          text: {}
        }
      }
    }
  )
end

By default, the search method is defined by the gem 'elasticsearch-model', which also provides the proxy object __elasticsearch__ to access the wrapper class for the Elasticsearch API. Therefore, we can replace the default query with one built from the standard JSON options described in the Elasticsearch documentation.
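
To see what the hash passed to __elasticsearch__.search looks like once it is serialized, here is a small standalone sketch. The helper name `build_search_body` and its keyword arguments are assumptions made for this example and are not part of any gem; the snippet only builds and prints the JSON, it does not contact Elasticsearch.

```ruby
require 'json'

# Builds the request body for a multi_match query with highlighting.
# The helper name and its keyword arguments are illustrative only.
def build_search_body(query, fields: %w[title text], tag: 'em')
  {
    query: {
      multi_match: { query: query, fields: fields }
    },
    highlight: {
      pre_tags:  ["<#{tag}>"],
      post_tags: ["</#{tag}>"],
      # An empty options hash per field asks for default highlight settings.
      fields: fields.each_with_object({}) { |field, hash| hash[field] = {} }
    }
  }
end

puts JSON.pretty_generate(build_search_body('ruby'))
```

Keeping the query in a helper like this makes it easier to inspect or unit test the JSON before handing it to the model's search method.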

The search method will now wrap the terms that match the query in the specified HTML tags. For the tags to show up, we also need to update the search results page so that it renders them as HTML instead of escaping them. Edit app/views/search/search.html.erb:

<h1>Search Results</h1>

<% if @articles.present? %>
  <ul class="search_results">
    <% @articles.each do |article| %>
      <li>
        <h3>
          <%= link_to article.try(:highlight).try(:title) ?
                article.highlight.title[0].html_safe : article.title,
              controller: "articles", action: "show", id: article._id %>
        </h3>
        <% if article.try(:highlight).try(:text) %>
          <% article.highlight.text.each do |snippet| %>
            <p><%= snippet.html_safe %>...</p>
          <% end %>
        <% end %>
      </li>
    <% end %>
  </ul>
<% else %>
  <p>Your search did not match any documents.</p>
<% end %>
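
One caveat with html_safe is that it trusts the whole snippet, including any markup that was already present in the article text. If articles can contain untrusted input, one possible approach is to escape the snippet first and then restore only the highlight tags. The sketch below assumes the `<em>` tags configured above; `safe_highlight` is a hypothetical helper name, not something provided by Rails or the Elasticsearch gems.

```ruby
require 'cgi'

# Escapes everything in a highlight snippet, then restores only the
# <em> and </em> markers that the highlighter inserted.
def safe_highlight(snippet)
  CGI.escapeHTML(snippet)
     .gsub('&lt;em&gt;', '<em>')
     .gsub('&lt;/em&gt;', '</em>')
end

puts safe_highlight('Learn <em>Ruby</em> <script>alert(1)</script>')
```

In a Rails view, the return value of such a helper would be the string to mark with html_safe, in place of the raw snippet.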

Add CSS styles to app/assets/stylesheets/search.scss for highlighted tags:

.search_results em {
  background-color: yellow;
  font-style: normal;
  font-weight: bold;
}

Try searching for "ruby" again; the matching terms are now highlighted in the results.

As you can see, highlighting search terms is easy, but it is not ideal: we have to send a JSON query exactly as the Elasticsearch documentation specifies, without any kind of abstraction.

Searchkick gem

The Searchkick gem is provided by Instacart and is an abstraction on top of the official Elasticsearch gems. I'm going to refactor the highlighting functionality, so we first add gem 'searchkick' to the Gemfile. The first class we need to change is the Article model in app/models/article.rb:

class Article < ActiveRecord::Base
  searchkick
end

As you can see, it's much simpler. We need to reindex the articles by running the command rake searchkick:reindex CLASS=Article. In order to highlight search terms, we need to pass an additional parameter to the search method in search_controller.rb.

class SearchController < ApplicationController
  def search
    if params[:term].nil?
      @articles = []
    else
      term = params[:term]
      @articles = Article.search term, fields: [:text], highlight: true
    end
  end
end

The last file we need to modify is views/search/search.html.erb because searchkick now returns results in a different format:

<h2>Search Results for: <i><%= params[:term] %></i></h2>

<% if @articles.present? %>
  <ul class="search_results">
    <% @articles.with_details.each do |article, details| %>
      <li>
        <h3>
          <%= link_to article.title, controller: "articles", action: "show", id: article.id %>
        </h3>
        <p><%= details[:highlight][:text].html_safe %>...</p>
      </li>
    <% end %>
  </ul>
<% else %>
  <p>Your search did not match any documents.</p>
<% end %>

Now it's time to run the application again and test the search functionality.

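Please note that I entered the search term "dato". I did this to show you that, by default, searchkick is set up to analyze the indexed text and to be more tolerant of misspellings.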
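
Tolerance for misspellings is usually expressed as an edit distance: the number of single-character changes needed to turn one word into another. The sketch below is a plain Levenshtein implementation written for this article; the actual matching happens inside Elasticsearch and may differ in detail. The target word "data" is only an assumed example of an indexed term that "dato" could match.

```ruby
# Dynamic-programming Levenshtein distance between two strings.
def edit_distance(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    curr = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      # Cheapest of: deletion, insertion, substitution (or match).
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

puts edit_distance('dato', 'data')
```

A query term that is one edit away from an indexed term, as in this example, is the kind of near miss a misspelling-tolerant search is meant to catch.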

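Autosuggest

Autosuggest, or typeahead, predicts what the user is going to type, which makes the search experience faster and easier. Keep in mind that unless you have thousands of records, it is probably best to filter on the client side.

Let's start by adding the typeahead plugin, which is available through the gem 'bootstrap-typeahead-rails'; add it to your Gemfile. Next, we need to add some JavaScript to app/assets/javascripts/application.js so that suggestions appear when you start typing in the search box.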

//= require jquery
//= require jquery_ujs
//= require turbolinks
//= require bootstrap-typeahead-rails
//= require_tree .

var ready = function() {
  var engine = new Bloodhound({
      datumTokenizer: function(d) {
          console.log(d);
          return Bloodhound.tokenizers.whitespace(d.title);
      },
      queryTokenizer: Bloodhound.tokenizers.whitespace,
      remote: {
          url: '../search/typeahead/%QUERY'
      }
  });

  var promise = engine.initialize();

  promise
      .done(function() { console.log('success'); })
      .fail(function() { console.log('error') });

  $("#term").typeahead(null, {
    name: "article",
    displayKey: "title",
    source: engine.ttAdapter()
  })
};

$(document).ready(ready);
$(document).on('page:load', ready);

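A few comments about the previous snippet. In the last two lines, because I have not disabled Turbolinks, this is the way to hook up the code I want to run on page load. In the first part of the script, you can see that I'm using Bloodhound. It is the typeahead.js suggestion engine, and I also set the JSON endpoint it calls with AJAX requests to fetch the suggestions. After that, I call initialize() on the engine and set up the typeahead on the search text field using its id "term".

Now we need the backend implementation for the suggestions. Let's start by adding the route; edit config/routes.rb: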

Rails.application.routes.draw do
  root to: 'articles#index'

  resources :articles
  get "search", to: "search#search"
  get 'search/typeahead/:term' => 'search#typeahead'
end

Next, I will add the implementation in app/controllers/search_controller.rb.

def typeahead
  results = Article.search(params[:term], {
    fields: ["title"],
    limit: 10,
    load: false,
    misspellings: {below: 5}
  })
  # Braces bind the block to map; a do...end block here would bind to render.
  render json: results.map { |article| { title: article.title, value: article.id } }
end

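This method returns the search results for the entered term as JSON. I am searching only by title, but I could also include the body of the article. I am also limiting the number of search results to a maximum of 10.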
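
The shape of the JSON that the typeahead widget receives can be sketched without Rails or Elasticsearch. In the snippet below, the `Hit` struct stands in for a search result, and `typeahead_payload` and the sample titles are hypothetical names and data used only for illustration.

```ruby
require 'json'

# Stand-in for a search hit that exposes an id and a title.
Hit = Struct.new(:id, :title)

# Maps hits to the { title, value } pairs the typeahead widget consumes.
def typeahead_payload(hits, limit: 10)
  hits.first(limit).map { |hit| { title: hit.title, value: hit.id } }
end

hits = [Hit.new(1, 'Ruby basics'), Hit.new(2, 'Ruby on Rails routing')]
puts JSON.generate(typeahead_payload(hits))
```

The "title" key matches the displayKey configured in the JavaScript, which is why each suggestion shows the article title.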

Now we are ready to try the typeahead implementation.

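Conclusion

As you can see, using Elasticsearch with Rails makes searching your data very easy and fast. Here I showed you how to use the low-level gems provided by Elasticsearch, as well as the Searchkick gem, which is an abstraction that hides some of the details of how Elasticsearch works.

Depending on your specific needs, you may be happy using Searchkick to implement full-text search quickly and easily. On the other hand, if you have more complex queries, including filters or groups, you may need to learn more about the query language in Elasticsearch and end up using the lower-level gems 'elasticsearch-model' and 'elasticsearch-rails'.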
