Sunday 30 October 2011

installing sunspot_rails on ubuntu production

We've added search functionality to Toygaroo with Solr via sunspot_rails.

If you are going to install it on Ubuntu then make sure you do:

sudo apt-get install openjdk-6-jdk

First!
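For anyone who hasn't used sunspot_rails before, the basic shape of the setup looks something like this (the model and fields here are illustrative, not our actual Toygaroo schema): add the gem, declare what gets indexed, and search from a controller.

# Gemfile
gem 'sunspot_rails'

# app/models/toy.rb (illustrative model)
class Toy < ActiveRecord::Base
  searchable do
    text :name, :description   # full-text indexed fields
    boolean :active            # filterable attribute
  end
end

# e.g. in a controller action
@search = Toy.search do
  fulltext params[:q]
  with :active, true
end
@toys = @search.results

Solr itself runs on the JVM, which is why the JDK needs to be in place before anything will start.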

Saturday 1 October 2011

Using the html5 boilerplate and rails 3

I am experimenting with using the html5 boilerplate with Rails 3.0.x. It is an interesting setup. I am using Russ Frisch's html5 boilerplate template, but updating it to use the latest boilerplate code.
It seems to be a pretty nice way of getting your CSS and JS ducks in a row. Over the next few days I need to see how it handles having jQuery Mobile in the mix. I am also not 100% sure how to use some sort of grid layout template (like 960 or 1140px) with it. Or even if you should!
We shall see.

Wednesday 28 September 2011

Suffering Rails3 slowdown

We've just pushed our Rails 3 upgraded app to production... and are suffering a massive slowdown in insert/update speed over Rails 2.
At the moment I am not sure of the exact cause of this.

It *might* be mysql inserts, though I can't quite see why that would be.

It *might* be because in this new version we are using vestal_versions to track changes.

It might be because the moon is in the house of Mars for all I know!

I hate getting stung by unknowns. The speed on our test environment is tolerable, slightly slower than the rails 2 version, but I was willing to accept that because the new version is doing so much more.

Benchmarking is one thing... but knowing why the benchmarks are slower is the key!

Monday 26 September 2011

Expiring fragments from daemons

We have an application that gets its data from a series of daemons that go out and read it in. This works great, except we are caching pages, and I'd like to expire those pages when the data is updated.
It turns out that an Observer doesn't have access to expire_action or expire_fragment. And a Sweeper is not called from data-only (i.e. non-controller) updates! Buggers!

But there is a solution. You can call the sweeper directly from your importer:

MySweeper.instance.clean_up(model_instance)

This works, except I couldn't get it to reliably expire the actions. So I used direct calls to Rails.cache.delete instead.
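Roughly, the delete calls look like this (the key name here is illustrative; the one detail worth knowing is that fragments written by the view cache helper get a "views/" prefix on their key):

# in the importer, once the new data has been saved
updated_records.each do |record|
  # matches a view-side <% cache("price_block_#{record.id}") do %> ... <% end %>
  Rails.cache.delete("views/price_block_#{record.id}")
end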

Thinking about it, I guess I could have just written an observer instead, as those do get called from controller-less updates.

Thursday 21 July 2011

meta_search sort_link helper and associations

It took me a while to find this, so, for my own memory I am going to quickly write this up.

I have a view that shows a table of objects (a pretty standard index view). The only issue was that I wanted to sort on one of the columns whose data comes not from the main object but from an association.
What I discovered is a line in a posting here that says:

You can define your own custom sort scopes. Define scopes named "sort_by_<name>_asc" and "sort_by_<name>_desc" and sort_link @search, :name will work as you might expect.

So, I have a model called Info defined like this:
class Info < ActiveRecord::Base
  belongs_to :region

  scope :contains_string, lambda { |str| where(:name.matches % "%#{str}%") }
  search_methods :contains_string

  scope :active, lambda { where(:active => true) }
  search_methods :active, :type => :boolean
end


What I found is that you can put this at the end:
scope :sort_by_region_name, lambda { joins(:region).order("regions.name asc") }
search_methods :sort_by_region_name


and then in my view I can do:

<table>
  <thead>
    <tr>
      <td width="25%"><%= sort_link @info_search, :name, "Info" %></td>
      <td width="25%"><%= sort_link @info_search, :region_name, "Region" %></td>
      <td><%= sort_link @info_search, :active %></td>
    </tr>
  </thead>
  ...body info...
</table>


And in my controller:
class Admin::InfosController < Admin::BaseController
  def index
    search = params[:search] || {"meta_sort" => "name.asc"}

    @info_search = Info.search(search)
    @infos = @info_search.paginate(:page => params[:page] || 1, :per_page => 15) # load the current page of matching records
  end
end


And presto, I have an index that:
a) defaults to sorting by name (see the first line of the index method)
b) lets me sort on an associated value

Cool!
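One footnote: the single asc-only scope above was enough for what I needed, but if you want the column link to toggle between ascending and descending, the advice quoted earlier says to define both directions, i.e. something along these lines:

scope :sort_by_region_name_asc,  lambda { joins(:region).order("regions.name asc") }
scope :sort_by_region_name_desc, lambda { joins(:region).order("regions.name desc") }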

Friday 1 July 2011

I'm sorry, but POW sucks

I have been using (or trying to use) Pow, and the Powder gem, for a few weeks now. At the outset it looked good: you get a local domain to test on, you don't have to worry about deployments, etc. etc.
But it sucks.
If you get an error in rendering (and we are using this in development, so of course this is going to happen), Pow goes into a tailspin, and on my MacPro it takes almost a minute to come out of it as it retries the request dozens of times.

So you end up with a huge log that looks pretty much like this:
  SQL (2.2ms)  SHOW TABLES
SQL (1.2ms) SHOW TABLES
SQL (0.9ms) SHOW TABLES
SQL (1.2ms) SHOW TABLES
SQL (0.9ms) SHOW TABLES
SQL (1.2ms) SHOW TABLES
SQL (1.2ms) SHOW TABLES
SQL (1.2ms) SHOW TABLES
SQL (1.0ms) SHOW TABLES
SQL (1.3ms) SHOW TABLES
SQL (0.9ms) SHOW TABLES
SQL (1.2ms) SHOW TABLES
SQL (0.9ms) SHOW TABLES
SQL (1.4ms) SHOW TABLES
SQL (1.2ms) SHOW TABLES
SQL (1.4ms) SHOW TABLES
SQL (1.1ms) SHOW TABLES
Rendered /Users/smyp/.rvm/gems/ree-1.8.7-2011.03@toygaroo_r3/gems/actionpack-3.0.3/lib/action_dispatch/middleware/templates/rescues/_request_and_response.erb (8663.2ms)
Rendered /Users/smyp/.rvm/gems/ree-1.8.7-2011.03@toygaroo_r3/gems/actionpack-3.0.3/lib/action_dispatch/middleware/templates/rescues/_request_and_response.erb (8728.0ms)
Rendered /Users/smyp/.rvm/gems/ree-1.8.7-2011.03@toygaroo_r3/gems/actionpack-3.0.3/lib/action_dispatch/middleware/templates/rescues/template_error.erb within rescues/layout (22759.6ms)
Rendered /Users/smyp/.rvm/gems/ree-1.8.7-2011.03@toygaroo_r3/gems/actionpack-3.0.3/lib/action_dispatch/middleware/templates/rescues/template_error.erb within rescues/layout (22758.7ms


I am going back to using Passenger standalone. OK, you lose the pretty local domain, but it works much more reliably.

Monday 2 May 2011

nginx, rails and ubuntu - 502 bad gateway

We were getting tons of 502 errors under load, but then I stumbled across a posting in a news group.

cat /proc/sys/net/core/somaxconn


This will show you the maximum socket listen backlog (i.e. how many pending connections the kernel will queue for a listening socket). It should be 1024, because Phusion Passenger is hard-coded for that value. Mine was 128!

Do this:
sudo sysctl -w net.core.somaxconn=1024


And then restart nginx.
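One caveat: sysctl -w only changes the value for the running kernel, so it will be lost on reboot. Assuming you want it to survive a reboot on a production box, also add the setting to /etc/sysctl.conf:

net.core.somaxconn = 1024

and reload with sudo sysctl -p.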

Sunday 1 May 2011

get-flash-videos and osx (off topic)

I travel a lot and want to watch TV shows from the UK while I am doing so. There is get_iplayer, which works nicely for BBC shows, but on occasion there are things on ITV (!) that I watch. For this I use 'get-flash-videos'.

I had a tough time getting this working on OSX, so in a nutshell here is what I did:

1) Get 'get-flash-videos'
The home page is here. You pull that down somewhere onto your Mac (I have it in a Downloads/get-flash-videos directory).

2) Update Perl!
This is the key! I know nothing about Perl, but here is what I did:
perl -MCPAN -e shell

This brings up a CPAN shell (CPAN seems to be Perl's package manager). You will need to update it first:
install Bundle::CPAN


Then you need to install Digest::SHA1 and Compress::Zlib:
install Digest::SHA1
install Compress::Zlib


3) FLVStreamer and rtmpdump
Installing FLVStreamer is non-trivial and there are other guides about that.
I downloaded rtmpdump from here. Then:
chmod +x rtmpdump
sudo cp rtmpdump /usr/local/bin/.


4) Grab video!
Now you should be able to do something like:
./get_flash_videos http://www.itv.com/itvplayer/video/?Filter=228293

And have it pull down an MP4. I use HandBrake to convert it for playing on the iPad.

Monday 28 March 2011

Toygaroo on Rails

Seeing as there has been a lot of press recently about my company Toygaroo I thought I might throw out some tech info for those of you who care.
For those who don't know, Toygaroo is America's biggest toy rental company (think Netflix for kids' toys). Recently - March 25th, 2011 - we appeared on the season premiere of Shark Tank, a national television show on ABC. About 4.6 million people watched the show.

The Platform
Toygaroo is a Ruby on Rails 3 application. It is based heavily on the code we wrote for FilmAmora.com, Spain's leading DVD rental company. We are running on Ubuntu 10.04 LTS. We're using Passenger 3 and Nginx (doesn't everyone?!). We're using Dalli in front of MemCached (though we've had some issues with this that I really should blog about!).

The Host
Right now we are running on Amazon's EC2 service - though with Mark Cuban coming on board that might change. I am a big fan of EC2, though I think the machines are a little underpowered for what you pay.

The numbers!
In the 2 hours after the show aired we received around 70,000 page views. The basic architecture is a load balancer sitting in front of a whack of app servers. We are not using a scaling solution right now - hey, we're a startup! - so we spin up more servers if we feel we need them. It is a pretty simple process - I have a script for us to follow to get an Ubuntu server up and running in no time.

Caching
I looked into other solutions, like Varnish, but decided that Rails could handle the job with a combination of page, action and fragment caching. And I haven't been wrong so far. Even under heavy load we are getting great response times. The key - as I found out with FilmAmora - is what level to cache at. We cache 'blocks': if you look at an index page with lots of toys, we cache each toy block. That block can appear on many different pages, so I find it a nice solution.
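To make the 'block' idea concrete, here is a rough sketch of what one of those cached blocks looks like in a view (the partial, markup and photo_url helper are simplified stand-ins, not our production code):

<%# app/views/toys/_toy_block.html.erb %>
<% cache("toy_block_#{toy.id}") do %>
  <div class="toy-block">
    <%= link_to toy.name, toy %>
    <%= image_tag toy.photo_url %>
  </div>
<% end %>

Because the cache key only depends on the toy, the same fragment gets reused on every page that toy appears on.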

As time goes on I'd like to post more about how Toygaroo is coded and run, as I think it will provide a nice real-world example of what you can do with Rails 3. If you have any questions, drop me a line (comment on here).

Saturday 12 February 2011

Background File Processing Daemon in Ruby

I am writing this up because I scoured the net and could not find what I would have thought was a common thing to do.
We have an application that needs to watch several directories (on the server) and parse files that are placed there (via scp) by a third party. FWIW, these files represent sports betting prices.

Requirements

A background task that would:
* be monitored
* run forever!
* process files instantly - parse them into Ruby objects and store them in our database for use by the Rails app

Our Rails app is written in 2.3.x (it's been running for a while) and uses Bundler.

The Solution

After some poking around I decided to use a combination of the Daemons gem, EventMachine and FSSM.

The Daemon

This was inspired heavily by a posting on StackOverflow.

1) Install what you need
I tried to get this working with Bundler, but it was a no go. So I needed to install daemons, eventmachine and fssm 'normally':
sudo gem install daemons eventmachine fssm

2) Set up the Daemon

Setup

Usual stuff for a Ruby file, plus requires for the gems we installed above:
#!/usr/bin/env ruby
require 'rubygems'
require 'daemons'
require 'eventmachine'
require 'fssm'


We have multiple directories that need watching, so we have an array:

watch = [
  "/Users/smyp/development/wl/xtf/horse",
  "/Users/smyp/development/wl/xtf/sport",
  "/Users/smyp/development/wl/xtf/live",
  "/Users/smyp/development/wl/xtf/alpha"
]

if ENV['RAILS_ENV'] == 'production'
  watch = ["/home/mcdata/horse", "/home/mcdata/sport", "/home/mcdata/live", "/home/mcdata/alpha"]
end


We launch a separate daemon for each directory as we don't want a huge file in the horses directory to slow down processing in the live directory.

Daemon Config

With the daemons gem you can set things like what the process will be called, where the pid file will reside, and so on.

dir = File.expand_path(File.join(File.dirname(__FILE__), '..'))

daemon_options = {
  :app_name  => "xturf_file_monitor",
  :multiple  => false,
  :dir_mode  => :normal,
  :dir       => File.join(dir, 'tmp', 'pids'),
  :backtrace => true
}


3) The Actual Daemon

Cue spooky music!

class PriceDaemon
  attr_accessor :base_dir

  def initialize(base_dir)
    self.base_dir = base_dir
  end

  def dostuff
    logger.info "About to start job for #{base_dir}"
    EventMachine::run {
      # Clear out anything already sitting in the directory, then watch it
      # for new or updated files.
      xhj = PriceFileJob.new(base_dir)
      xhj.clear_backlog
      FSSM.monitor(base_dir) do
        create { |base, relative| xhj.clear_backlog }
        update { |base, relative| xhj.clear_backlog }
      end
    }
  end

  def logger
    @@logger ||= ActiveSupport::BufferedLogger.new("#{RAILS_ROOT}/log/price_file_monitor.log")
  end
end


What this does is:
a) create a class that takes the directory to watch as an initialize parameter
b) do an EventMachine run that first clears out any backlog files and then fires up an FSSM monitor. The FSSM monitor gives us events on create, update (and delete, but we don't care about that). As a safety measure I simply trawl through the entire directory every time a file is created or updated. This ensures that anything we missed will get caught.
We delete files ourselves after processing, so the directory should only have a few files in it anyway.

4) Spawn the Daemon

Bring on Mia Farrow!

watch.each_with_index do |base_dir, i|
  Daemons.run_proc("price_daemon_#{i}", daemon_options) do
    Dir.chdir dir
    PriceDaemon.new(base_dir).dostuff
  end
end


This will go through our array and fire up a daemon for each directory. There are downsides to doing it this way - it's not so easy to start and stop an individual one (but then they shouldn't ever die, so if one does we just stop and start them all).
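In practice that just means running the script with the relevant command - Daemons.run_proc gives the script a small command-line interface, and these are the same commands the Capistrano tasks below use:

ruby ./script/price_file_monitor.rb start
ruby ./script/price_file_monitor.rb status
ruby ./script/price_file_monitor.rb stop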

5) The File Processor

This, of course, will be specific to your operation, but here's an outline of ours:

class PriceFileJob
  attr_accessor :base_dir

  def initialize(base_dir)
    self.base_dir = base_dir
    logger.info "watching #{base_dir}"
  end

  def logger
    @@logger ||= Logger.new("#{RAILS_ROOT}/log/price_file_job_#{base_dir.split("/").last}.log", "daily")
  end

  def clear_backlog
    # Process everything currently in the directory, oldest first.
    # Skip dot entries ("." and "..") so we only hand real files to process_file.
    files = Dir.new(base_dir).entries.reject { |f| f =~ /\A\./ }
    files = files.sort_by { |f| File.stat(File.join(base_dir, f)).ctime }
    files.each do |file|
      process_file(file)
    end
  end

  def process_file(file)
    # Parse the file, store the prices, then delete the file.
  end

  private
end


6) Capistrano

We use Capistrano to deploy, so I included some tasks in our deploy.rb:

before "mc:release", "file_processors:stop"
after "mc:release", "file_processors:start"

namespace :file_processors do
desc "start processors"
task :start, :roles => :db do
run "cd #{current_path}; RAILS_ENV=#{fetch :rails_env} ruby ./script/price_file_monitor.rb start"
end

desc "get status of processors"
task :status, :roles => :db do
run "cd #{current_path}; RAILS_ENV=#{fetch :rails_env} ruby ./script/price_file_monitor.rb status"
end

desc "stop processors"
task :stop, :roles => :db do
run "cd #{current_path}; RAILS_ENV=#{fetch :rails_env} ruby ./script/price_file_monitor.rb stop"
end
end


That's it! I hope you found this interesting.

I should also write up how we monitor these processes... maybe next time!