Read my latest article: 8 things I look for in a Ruby on Rails app (posted Thu, 06 Jul 2017 16:59:00 GMT)

When TSearch2 Met AJAX

Posted by Mon, 22 Aug 2005 01:48:00 GMT

Last night, a local PDX.rb-ist, asked about full text searching in PostgreSQL. I pointed him to TSearch2, which is a nice little addon to handle full text searching with indexing, ranking, highlighting, etc. To my knowledge, it’s the closest to a google-like search that you can get with PostgreSQL. Some people in #postgresql (irc.freenode.net), said that you can build custom functions that will allow you to quote content, and do other fun stuff within your search string. We can discuss that another time.

After thinking it over, I thought, “why not put ajax on top of a full text search and see what it can do?”

The first question, where was I going to get a bunch of content that I could search through and have it be somewhat meaningful for the public, if I decide to put it up as a demo page. The RubyOnRails mailing list came to mind, so after seeing that I couldn’t download the full archive from the rails mailman page (at least not that I could tell), I decided that I would just import my Maildir for that mailing list.

This added another initial step. What would be a good way to import the 13,000~ emails that I had in the folder?

I knew that worst case, I could find a module on CPAN and build a perl script to import it… since I didn’t see anything in the standard ruby library. Then I found TMail. Someone said that they think ActionMailer uses TMail as well.

The resulting quick and dirty script became:

#!/usr/bin/env ruby

require 'tmail'
require 'rubygems'
require 'postgres'
require 'dbi'

conn = DBI.connect("DBI:Pg:database=rails_mailinglist;host=localhost;port=5403", "username", "password" )

MAILBOX = ".MailingLists.Ruby.RubyOnRails"

sql = "INSERT INTO archives (sender, recipient, subject, body) VALUES (?,?,?,?)"

@sth = conn.prepare(sql)

box = TMail::Maildir.new(MAILBOX)

box.each do |port|
        mail = TMail::Mail.new(port)
        p mail.subject
        @sth.execute(mail.from, mail.to, mail.subject, mail.body)
end

exit

Not rocket science. :-)

Okay, so I let that start running through the mailing list emails that I have, and opened up another tab in iTerm and typed our friend, rails archives followed by cd archives. The next step was to modify the config/database.yml file.

(you all know how to do that, right?)

Okay, you should still be with me…so far.

After I got my database settings in place, I ran ./script/generate scaffold Archive and watched it created my new filles to play with.

./script/server and I am looking at the first several emails that are in my RubyOnRails mailing list folder. I notice that the first one is the confirmation email from the day that I signed up on the mailing list. Mon, 24 Jan 2005 16:00:14 +0000 (GMT) . So, I delete that email and the ‘welcome to..’ one so that no one sees my mailman password/confirm info. ;-)

Installation

So, Rails has no problem with the data. So, I then head over to the Tsearch2 site and look for some installation information. I walked through this walkthrough

Database Structure

For this example, I kept it pretty simple for the database structure. I believe the create script was:

CREATE TABLE archives (
  id SERIAL PRIMARY KEY,
  sender VARCHAR(255),
  recipient VARCHAR(255),
  subject VARCHAR(255),
  body TEXT
);

The rest was basically following through with those steps and building the triggers and functions around the subject and body fields in the table.

To use the tsearch2 functionality, I used find_by_sql rather than using just find.

@archives = Archive.find_by_sql("SELECT id, headline(body,q) as headline, body, rank(idxfti,q) as rank, sender, subject  FROM archives, to_tsquery('#{@str}') AS q WHERE idxfti @@ q ORDER BY rank(idxfti,q) DESC LIMIT 100") 

The @str variable is a value that I build based on the string(s) that the user is typing in the search field. Tsearch2 requires that you sepeare each string with a pipe (|). So, I put in a few checks on the string that was being passed to my method in my controller by AJAX. (I’ll let you take the time to figure out how to get AJAX in Rails working and watching a text field… it’s not hard to find info on google. ) :-)

The end result?

I will warn you that this does’t work in all browsers, some IE people said they had issues… and I spent enough time tinkering with it to just settle with this for now. :-)

I present… fulltext searching with PostgreSQL on Rails.

There are approx 13,000 emails in the system, so I put a limit on the number of responses that show up to 100.

My Thoughts

Well, it was an interesting concept. I’m not a big fan of livesearching, it doesn’t really seem to buy us much when working with this sort of data. I do find live auto-completion to be quite useful though. It’s not practical to have AJAX peg the database every second as I type for new content and it’s obvious that a database with that much content is not going to respond as snappy as you would hope. However, I decided to compare the speed to searching in Thunderbird and Evolution. From my sophesticated benchmarking suite (my imaginary stop watch)...

AJAX won!

okay, I should be fair and say, Tsearch2 won as it is doing all the heavy lifting.

Enjoy!

Get help with your Rails project

comments powered by Disqus