Saturday, 28 June 2008

Rails 2.1 caching - nothing is ever easy!

Last night I watched the new Railscast episode that talked about the new caching features in Rails 2.1. I thought it looked cool and would add it to the new round of FilmAmora changes.

But... nothing is every easy!

I may look simple, but I encountered several problems

1. Conflicts with GetText

We use GetText for the translations on FilmAmora. We like it because there are free poEditor apps on every platform and we can easily send off the files to whomever to be translated.

The problem comes with this line:
require 'gettext/rails'


That you need to have to fire up some of the Rails-specific GetText stuff. This is all fine - it has been working for quite some time. But, when I tried to cache our Genres like this:
  def self.all_cached(language)
key = "genres_#{language}"
Rails.cache.fetch(key) {Genre.find(:all).sort_by {|genre| genre.get_description}.reject {|g| g.get_films_count == 0}}
end


Here is the result:
undefined method `cache' for GetText::Rails:Module


What?!?! Yes, it seems that GetText::Rails will hide Rails. This is, quite frankly, SHIT. So, after a long time poking around I have discovered that you need to do this:

  def self.all_cached(language)
key = "genres_#{language}"
::Rails.cache.fetch(key) {Genre.find(:all).sort_by {|genre| genre.get_description}.reject {|g| g.get_films_count == 0}}
end


Yippee! It will all work now, right?

Wrong.

2. Class != Class

I hit refresh the first time and wow was I excited! I saw lines like this in the log:
Cache write (will save 0.59562): genres_es


Woo hoo! Look at all the time I will save!

So I hit refresh.
undefined method `get_description' for #<Genre id: 1, description: "Action and Adventure", order_by: 2>


What?!?!

I am using the 'default' memory store. Something is going funny with retrieving objects from it. I never solved this problem.

If I do this in the console:
>> @genres = Genre.all_cached("es")
>> @genres = Genre.all_cached("es")


I see this in the log:
Cache write (will save 0.51974): genres_es
Cache hit: genres_es ({})

And I can do this:
>> @genres[0].get_description("es")


So what is going on in my web app? I have no idea. It seems that the class retrieved from the memory cache is incomplete in some way. get_description is not an accessor, it does go off and get the translation, but... so what? it is still a method.
This had me stumped!

3. File Store no worky
So I added this to development.rb:
config.cache_store = :file_store, '/cache_store'

I ended up getting this:
undefined method `get_description' for #<String:0x5459f0c>


Ok, I am tired of this now.

4. MemCached
Everyone is talking about using MemCached for this kind of thing. Now, I know developers and we are a lazy bunch. My guess is that this whole caching stuff has been written with MemCache in mind and screw anything else (see points 2 and 3 above).
So I installed MemCache and changed the line in the development.rb to be this:
config.cache_store = :mem_cache_store


I restarted the server and hit refresh. Trying to contain my excitement I saw a properly rendered page.
I hit refresh again.
undefined class/module Subgenre

For CHRIST'S SAKE!
After another web-scouring exercise I discovered a solution.

My method now looks like this:
  def self.all_cached(language)
key = "genres_#{language}"
Subgenre
::Rails.cache.fetch(key) {Genre.find(:all).sort_by {|genre| genre.get_description}.reject {|g| g.get_films_count == 0}}
end


And guess what? It works. Of course it now means running memcache on my local machine and installing it on the production box. But I've saved .52 seconds! It took 3 hours to get to the solution, so I figure in only 350 web page hits I will make it back.

Oh, of course I was already caching the html on the server for most things, but this is a little tidier.

Friday, 27 June 2008

Linking offsite

When you get into something like Ruby on Rails you want to use it for everything - the old adage about getting a hammer and everything becomes a nail rings true in programming!

I wanted to link to Wikipedia. I thought I could use link_to for something like that, but it turns out that really link_to (and it's variations) are meant to link to something inside your RoR app, not outside of it.

So I wrote this handy method that generates a link off to Wikipedia.

def wikipedia_link(text, language = current_language.to_s)
"<a href=\"http://www.wikipedia.org/search-redirect.php?search=#{h(text)}&amp;language=#{language}\" target=\"new\">Wikipedia</a>"
end

It's fairly simple - pass in the text you want to find on Wikipedia and your language.

Thursday, 19 June 2008

IE and form spacing

One has to wonder how Microsoft got it so wrong, but anyway.

One thing I love about RoR is the ajax side of it and how easy it makes it. For instance adding something to a cart with ajax:

<% form_remote_tag :url => {:controller => :store, :action => :add_to_cart, :id => film} do %>
<%=submit_tag(_('Buy &raquo;'), "title" => _("add to cart @ %s") % number_to_currency(film.price, :unit => "€"), :class => "buy_button_big buy_button")%>
<% end %>



That is great. On FireFox it looks like this:


But on IE6 it looks like this:


Insane quatities of extra space. Seems that IE loves to add space after forms. You can't stop it!

Or can you!?

I have found that adding this in my css is the trick:

form {
margin:0;
padding:0;
}


But, since I don't need it in ff... why not use the very sweet CSS Browser Selector plugin.

Now you can do this:

.ie6 form {
margin:0;
padding:0;
}


Add all shall be well in the garden.

Tuesday, 17 June 2008

pull from another site

Recently a site I was working on needed to forward all unknown requests to another website (their old website). This turned out to be shockingly easy to do!

In the Application Controller:

  def default
logger.debug("request.env[\"REQUEST_URI\"] #{request.env["REQUEST_URI"]}")
render :text => Net::HTTP.get_response(
"www.oldsite.com",
request.env["REQUEST_URI"]
).body
end



And at the very bottom (so it is only done if nothing else catches the route):

  map.connect '*path' , :controller => 'application' , :action => 'default'



Easy peasy!

Saturday, 14 June 2008

Format My Source Code for Blogging

Format My Source Code for Blogging

Absolving your views of responsibility

One thing I struggled with when I started out with RoR was how to create a site that was 'current looking'. In all the books they told you about views and layouts, but not how to use the layout to its most effective end.
What I wanted was a simple 3 column website. But instead of 'static' information in the left and right column I wanted it to change. In some cases it was ads, in some cases a list of items, etc. The only thing I could think of was to hold a lot of layout information in the view. So, I would have the header and footer in layouts/application.html.erb but in the middle there was a big <%= :yield %> that would handle the middle (vertically) of the page, including the columns.
(Hold on, I'm getting to the code!)
The problem with this is that if your view is wrong, or if you don't include it, then your site goes all to hell.

The solution, I found, was named :yields.

To illustrate I am going to build a funky 3 column site in 5 minutes.

Create the project

$ rails threecol


To set up a new project

Neat tip: open TextMate and drag the threecol folder onto the icon in the doc. That way it will open up the whole structure (i.e. the rails project).

Add a layout

Matthew James Taylor has a great website with some free to use CSS templates. So, we're going to use his 3 column one.

Go here. View source, copy the CSS.

Create a new file in your public/stylesheets directory called site.css and paste the CSS from Matt's page in.

Include the CSS in your Layout

Create a file in your app/views/layouts directory called application.html.erb. This is the name for the default layout, so you don't need to change anything in the controller we'll create in a moment.

Go back to Matt's page, view source and paste the whole thing into application.html.erb. Delete the styles definitions from here. (lines 12 - 170 as of this writing)

Knock up a scaffold

(Best to setup the database.yml here first)

$ script/generate scaffold product name:string description:string


note: in Rails 2 you don't have the same scaffolding as before. doing the above, where you define some fields, will give you a controller, model, views and migration

Get everything setup:

$ rake db:drop db:create db:migrate test


Get your CSS on

Delete products.html.erb from the layouts directory. For our example we don't need it.

Back in your application.html.erb paste this line in the header:

<%= stylesheet_link_tag 'site' %>


If you start the server (script/server) now and go to localhost:3000/products you should see Matthew's page all nice and purty because we haven't included any of our content.

Get your content in

Delete the center of Matthew's code (lines 40 - 59 here) and replace it with

<%= :yield %>


Refresh now and you will see your scaffold in the middle. Nice.

Sidebars

Ok, we could have done all that easy, so let's get to the fun stuff.
Delete Matthew's code for either sidebar.
The center of your application.html.erb should now look like this:

            <div class="col1">

<!-- Column 1 start -->
<%= yield %>
<!-- Column 1 end -->
</div>
<div class="col2">
<!-- Column 2 start -->
<!-- Column 2 end -->
</div>
<div class="col3">
<!-- Column 3 start -->
<!-- Column 3 end -->
</div>



inside column 2 (the left nav in Matthew's layout) put this:

<%= yield :left_nav_content %>


In views/products/index.html.erb do this:
<% content_for :left_nav_content do %>
<h1>Here it is!</h1>
<div>Exciting left-nav content</div>
<% end %>



What is going on?

We all know about <%= yield %>, but adding a name to it says to look in your view (or any views including, even through partial rendering) and see if there is something that wants to write out to this space. <% content_for :left_nav_content do %> says that the stuff in here, please write there.

The power is - you can do anything in there. Include partials, for instance.

The advantage is that now you can not screw up your layout. You can still put things whereever you want, but the layout is out of the view's hands.

I hope this is clear. It made a big difference to how I construct views!

Wednesday, 11 June 2008

Ultrasphinx: ' != "

Just ran into a wee ultrasphinx problem.

I had been doing this:

@query = "actor:'#{@person.name}' OR director:'#{@person.name}'"


but it turns out that did nothing! You need to surround query segments in double quotes for it to actually parse them as a single unit:

@query = "actor:\"#{@person.name}\" OR director:\"#{@person.name}\""

Tuesday, 10 June 2008

Legitimate job posting?

I recently responded to a job posting I saw on jobs.rubynow. The company seemed to like me:

"Thank you for your interest in ***. After our first round of resume reviews, we are interested in moving you forward to our next round of evaluations. This is a significant accomplishment considering the competitive nature of this position, and the small number of candidates for a position that go to on – so congratulations! "

Wow! Great! Not exactly sure of the English at the end of the sentence there, but cool.

They then proceeded to ask me a list of questions... all of which I had already answered on the resume they loved so much! But I humored them and answered them again.
A few days later I got this back:

"A mini-test is a round between resume reviews and interviews, that helps the company get a sense of your real interest in the job, as well as your skills and abilities. Receiving a mini-test is a pretty significant accomplishment considering the competitive nature of this position, and the small number of candidates for a position that will receive one – so congratulations! "

Hmmm surprisingly similar to the first email.

They wanted me to write a test. Not really what I like to do, but perhaps I'd be interested. But here is the test:

3. Code as much of the system as you can within a time-box, which is a maximum of 10 hours in this case.

Uh, they want me to spend 10 HOURS???? on their test!?!

The real kicker for me was that they sent a spec for the test. It was to write them a new system for their HR department to track resumes! They didn't want a test, they wanted a free application!

They've made it sound like competition is fierce, so you'd better really do a good job. Now, most developers are proud of what they write and it is a remote test, so why not spend 20 hours or more and create an awesome system!

Or am I just being paranoid?

Oh, the also never answered any of my emails. I only got the bulk emails going out to candidates.

BTW, the company was Rich Applications Consulting,

Ultrasphinx cheatsheet: indexing habtm relationships

In the previous post I quickly summarized how I am doing the indexing of fields of a model. Quite simple I know.

I struggled a little bit with getting the other models I wanted indexed to come through.

Before, as I mentioned, I was using Ferret. Now, despite all of its memory-hog bugginess it is simple to understand because you can modify stuff at the post-sql stage. So before, for instance, I had a method on the film model that did this:

def participants_name
participants.collect{|a| a.name}.join(' ')
end


That gave me a big long string and I could pass that to acts_as_ferret and it would use that to find all the films with Johnny Depp in them.

The problem came with Sphinx not doing any post-sql processing. I wanted the same results, but how?
The model looks like this:

class Film < ActiveRecord::Base
has_and_belongs_to_many :participants
...
end



At first I thought I could index participants and then step through all of their films, but that seemed too much work.

So... here comes the cheatsheet part!

How to index HABTM relationships

Again, maybe this is in the documentation but I had a hard time piecing it together.

I do this:

is_index :concatenate => [
{:class_name => "Participant",
:field => "name",
:table_alias => "p1",
:as => "actor",
:association_sql => "LEFT OUTER JOIN films_participants ON (films.`id` = films_participants.`film_id`) LEFT OUTER JOIN participants AS p1 ON (p1.`id` = films_participants.`participant_id`)"}
]


So, where as the :fields array of is_indexed is used for the fields of a model, :concatenate is where you set up the habtm relationships.
To break it down
1 - :class_name => the name of the class that this model relates to. We need to set this because you can reference by a different name as we'll see in a moment

2 - :field => same as the fields table on this model, what field do we want to index on. Not it is called 'field' and not 'fields'! You can only do one.

3 - :table_alias => i.e. GODSEND! I have 2 Participant habtm in my film model - one for actors and one for directors. The directors :concatentate definition is almost exactly the same as the actors one. If we didn't have :table_alias Sphinx wouldn't be able to tell them apart. Remember, Sphinx uses all this cleverness to generate one (very clever!) bit of sql.

4 - :as => what you want to call this in your search. I can then write a query that says "actor:'Johnny Depp'" and it will search only on that field.

5 - :association_sql => yes, Virginia, I am sorry but you have to write your own sql. It will ALWAYS look like this, so just copy and past mine and replace the table and column names.


That's it.

You can now do a query like this:

@films = Ultrasphinx::Search.new(:query => "Johnny Depp", :class_names => ['Film']).run



And it will search for any film titles with Johnny Depp in them AND any actors in the film. Sweet!

Now, didn't I say something about directors?

I have another habtm relationship:
 has_and_belongs_to_many :directors, :class_name => "Participant", 
:join_table => "directors_films"


Same other model (Participant). So now my :concatenate will look like this:

:concatenate => [              
{:class_name => "Participant", :field => "name", :table_alias => "p1",
:as => "actor", :association_sql => "LEFT OUTER JOIN films_participants ON (films.`id` = films_participants.`film_id`) LEFT OUTER JOIN participants AS p1 ON (p1.`id` = films_participants.`participant_id`)"},
{:class_name => "Participant", :field => "name", :table_alias => "p2",
:as => "director", :association_sql => "LEFT OUTER JOIN directors_films ON (films.`id` = directors_films.`film_id`) LEFT OUTER JOIN participants AS p2 ON (p2.`id` = directors_films.`participant_id`)"}]


Two references to Participant, but :table_alias saves the day.

Monday, 9 June 2008

Ultrasphinx cheatsheet: indexing fields you need!

The Sphinx/Ultrasphinx combination is great for searching. Very fast and much less load on your database than other options. It also takes a lot less memory and is more reliable.

But, the documentation is maddening. Here is what I have learned about implementing it.

To index a field on a model:

is_indexed :fields => [{:field => :field_name, :sortable => true}]


The sortable lets you sort on it later in searches. The thing to remember is that sphinx knows nothing except what you explicitly tell it. So it won't know about fields you don't mention, or what you might want to do with those fields. Don't include the :sortable flag and you can't sort on it.

To that end, we have a field called 'available' that we set/unset if an item in the catalogue shouldn't be displayed online. Now, we never really want to search on that, but, we do want to include it in searches as a filter. So, the first line of our model is really this:

is_indexed  :fields => [{:field => :title, :facet => true, :sortable => true}, :available]


That means we can do this:

@films = Ultrasphinx::Search.new(:query => "title:Foo", :class_names => ['Film'], :filters => {"available" => 1}).run


And we'll just get a list of available films named Red (or with Red in the title).

Sunday, 8 June 2008

Ultrasphinx & UTF-8 data

On filmamora.com we obviously have a lot of data that is not your standard ascii - Spanish film titles and Actor's name especially.

After switching from Ferret to Sphinx as a search engine (more about this in another post) everything was going great using the ultrasphinx plugin. That is until I tested searching on a name like 'Belén'. No films were found with an actress with that name - though I knew there were loads of actresses with that name and therefore films.

The problem was the data settings in the database. Even though I was displaying the names correctly this was due to a transliteration between the characters in the database and the webpage. The data was not being stored correctly.

You MUST create tables as default charset utf8 to get the correct results:

Step 1) Create the table in a migration

create_table(:films, :options => 'DEFAULT CHARSET=utf8') do |t|
t.string :title, :null => false
....
end


Step 2) Tell rails that the database is in utf-8 by putting this in your database.yml:

development:
adapter: mysql
database: filmamora_development
encoding: utf8


Step 3) Make sure the data going IN is in utf8. Filmamora gets a lot of data from scraping our wholesales site (using scrapi), so I do this:

film_info_es = dvdgo_movie.scrape(URI.parse("#{server}/#{page}"),
:parser_options => {"char-encoding" => "latin0",
"output-encoding" => "utf8", "quote-marks" => "true" },
:user_agent => "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12")


Step 4) search!
@films = Ultrasphinx::Search.new(:query => @query, :class_names => ['Film'],
:page => params[:page]||1, :per_page => 12,
:filters => {"available" => 1}).run

Now, if you were (like us) unlucky enough to have NOT done this from the start and can't re-run your migrations (we have a catalogue of 35,000 items!), you can do this to your tables:

ALTER TABLE mytablename DEFAULT CHARACTER SET utf8

We then deleted all our character data (film titles, actor's name, synopsis) and reimported. We kept our id's (which we needed to make sure our existing orders and wishlists, etc didn't go wrong) but got the new data.

In theory you don't need to set the charset on each table. In MySQL you can create a database with a default charset:

CREATE TABLE tablename DEFAULT CHARACTER SET utf8;

And then all your tables should use it too.

But, I don't think that rake db:create does this... (maybe it does now that I have set the character set in the database.yml... hmmm). I also am not sure if the rake db tasks to create tables explicitly set the charset or not. Anyone?

I'm going to write a few posts on sphinx/ultrasphinx because it is so awesome and I had some issues finding clear info on how to use it, so I might as well write it up.