I am a Sr. Software Developer at Oracle Cloud. The opinions expressed here are my own and not necessarily those of my employer.
Reporting frameworks
As the amount of data we deal with grows it is important to effectively present it. Users need to see high level summary and then drill in on specific details. It might not be glamorous but it’s essential for any organization.
- Charts
- Sorting / Filtering / Pagination
- Export
- Emailing reports
- Custom reports
- Access permissions
- Data pre-generation and archiving
- Reporting API
When I used .NET technologies in the past I really liked SQL Reporting Services. While the reports had fairly standard look they were easy to build with WYSIWIG tols and wizards. They also gave you features such as exporting and emailing reports right out of the box.
Unfortunately I have not been able to find such an integrated framework in Ruby world but there are lots of gems that allow you to build something much more customizable to your needs.
Let’s assume our application has Users (first_name, last_name, email), Articles (title, body, belongs_to user) and Comments (body, belongs_to user and article) models. All models have created_at and updated_at. You might need reports such as articles published by day, articles with most comments, users sorted by the number of articles they published, etc.
Charts
Charts help us to easily visualize data. There are many libraries out there but I like the simplicity of chartkick with Google Charts API.
# app/views/articles/index.html.erb
<%= line_chart Article.group_by_day(:created_at).count %>
Instead of putting business logic in our View layer I prefer to do this:
# app/views/articles/index.html.erb
<%= line_chart(ChartData.new.get_articles_by_day) %>
# app/services/chart_data.rb
class ChartData
def get_articles_by_day
# chartkick can pass data as Hash or Array
end
end
This allows you to build much more complex logic for gathering your data and test it w/in your Ruby classes. If you don’t like service objects you could put it in your models or decorators. Chartkick and Google Charts support many other chart types and configuration options, read their docs for more details.
Sorting / Filtering / Pagination
If you are not too particular about how the page looks, Jquery DataTables gives you a lot right of the box. Here is a RailsCast episode.
If you need more control over the presentation then you could build a custom form with form_tag
and modify index action to accept additional parameters.
# app/views/users/index.html.erb
<%= form_tag(users_path, method: :get) %>
<%= date_field_tag 'created_at', required: 'required' %>
<%= submit_tag %>
<% end %>
class UsersController < ApplicationController
def index
# check for additional filtering parameters
if params[:created_at].present?
@users = User.where("created_at > ?", params[:created_at])
else
@users = User.all
end
end
end
Another interesting gem is datagrid but I haven’t done much with it. For pagination I usually use kaminari. It works with many ORMs and is highly configurable.
Export
Even when we build highly visual and interactive dashboards users still need to export records into Excel.
Rake
When all you need is a basic report to dump data into XLSX it can be done via simple Rake task. Create lib/tasks/reports.rake
namespace :reports do
desc 'users joined last week'
task users: :environment do
# => query data
users = User.where("created_at > ?", Time.now - 1.week)
# => prepare output file
output_file = "tmp/users_joined_last_week_#{Time.now.strftime('%Y-%m-%d-%H-%M-%S')}.xlsx"
package = Axlsx::Package.new
sheet = package.workbook.add_worksheet(name: 'data')
sheet.add_row ['First Name', 'Last Name', 'Email', 'Joined On']
users.each do |u|
sheet.add_row [u.first_name, u.last_name, u.email, u.created_at]
end
# => save the file
package.serialize (output_file)
end
desc 'another report'
task report2: :environment do
# code here
end
end
I am using axlsx and axlsx_rails gems. You can run this task on demand or via crontab with whenever gem. Obviously this approach is not very scalable and hard to test. But it’s great for prototyping things.
Service Objects
For more robust solution I like to use Service Objects for report generators. I create folders app/services/reports and spec/services/reports.
class UserReport
def perform
users = User.where("created_at > ?", Time.now - 1.week)
# same code as in the rake task above
package.serialize (output_file)
end
end
# add this to application.rb to load new class
config.autoload_paths += Dir[Rails.root.join('app', 'services', '{**}')]
But then you need to create more reports. There will be lots of common code when it comes to saving and emailing files but the business logic of each query will differ. Object oriented programming to the rescue.
class BaseReport
def initialize
@output_file = "tmp/#{self.class.name}_#{Time.now.strftime("%Y-%m-%d-%H-%M-%S")}.xlsx"
end
def email_file
end
def archive_file_to_S3
end
...
end
class UserReport < BaseReport
def initialize
super
end
def perform
# report specific biz logic here
email_file
archive_file_to_S3
end
end
You could group BaseReport methods into before_ and after_ and use Active Model Callbacks.
You also want to give users ability to execute these reports via UI so you don’t have to run bash commands. For that I use ReportsController with index and create actions. You can download the file with send_data or send_file.
# config/application.rb
config.report_types = ['users', 'articles', ...]
# routes.rb
resources :reports, only: [:index, :create]
# app/views/reports/index.html.erb
<% Rails.application.config.report_types.each do |rep_type| %>
<%= link_to rep_type, "#{reports_path}?rep_type=#{rep_type}", method: 'post' %>
<% end %>
# app/controllers/
class ReportsController < ApplicationController
def create
result = ("#{params[:rep_type].capitalize}Report").constantize.new.perform
redirect_to reports_path, notice: result
end
end
Using duck typing I am calling the appropriate report. So if I add new report type I just include it in application.rb and create new service object following the naming convention. No need to change controller or UI.
Background Jobs
Some reports can take a long time to execute so we run them via background job using Sidekiq (or Resque or DelayedJob).
class ReportsController < ApplicationController
def create
# just throw it into queue and give feedback to the user
ReportJob.perform_later(params[:rep_type], current_user.email)
redirect_to reports_path, notice: 'you will receive your report shortly'
end
end
# the job now contains the logic of which report generator service object to use
class ReportJob < ActiveJob::Base
queue_as :default
def perform(rep_type, user_email)
("#{params[:rep_type].capitalize}Report").constantize.new.perform(user_email)
end
end
# same code as before only now you email results
class UserReport
def perform(user_email)
...
end
end
# to keep things simple I use the same email template for all reports
# app/mailers/my_mailer.rb
class MyMailer < ApplicationMailer
def report(user, subject, body, attachment)
# standard ActionMailer code
end
end
# create app/views/my_mailer/report.html.erb
Emailing reports
A nice feature is to enable users to opt-in to receive specific reports by email on periodic basis.
Frequency
We need a new model to store user selected reports / frequency on top of which you can build standard Rails scaffold with CRUD operations.
# config/application.rb
config.report_frequency = ['daily', 'weekly', 'monthly']
config.report_types = ['users', 'articles', ...]
# app/models/user_reports.rb
class UserReports
# Mongoid schema
belongs_to :user, index: true
field :frequency, type: String
field :report_type, type: String
extend Enumerize
enumerize :frequency, in: Rails.application.config.report_frequency
enumerize :report_type, in: Rails.application.config.report_types
end
Then we schedule a nightly job. To keep things simple users receive daily reports for previous day’s data, weekly reports on Monday (for previous week) and monthly reports on the 1st (for previous month).
# app/jobs/user_report_job.rb
class UserReportJob < ActiveJob::Base
def perform()
# query user_reports to determine reports need to be sent out to whom
# pass appropriate date range filters to report service objects
end
end
I like using sidekiq-cron to schedule the jobs but there are alternatives.
Format
The most common format I export data to is Excell. For PDF you can use prawn. Here is good RailsCasts episode.
Custom reports
Sometimes you have large customers that have unique data requirements that cannot be satisfied with the common reports. As much as it pains to write custom code, sometimes we need to do it. The important thing is to keep it separate from your main codebase.
I usually create folders such as app/services/custom/ and app/jobs/custom/ and place my code there. You can build these one-off reports quickly and email the files (have your customers create email distribution lists so they can add/remove recipients).
To expose these reports via UI you can create a config file to map which reports can be seen by whom.
# config/initializers/custom_reports.rb
CUSTOM_REPORTS_MAP = {
'customer1_id' => ['customer1_report1'],
'customer2_id' => ['customer2_report1', 'customer2_report2'],
}
# app/services/custom/custom_reports_map.rb
class CustomReportsMap
def get_reports current_user
# get current_user.customer_id and find list of reports in CUSTOM_REPORTS_MAP
end
end
# app/views/reports/index.html.erb
<% CustomReportsMap.get_reports(current_user).each do |rep_type|) %>
<%= link_to rep_type, "#{reports_path}?rep_type=#{rep_type}", method: 'post' %>
<% end %>
end
ReportsController code will use duck_typing to run the appropriate class. I have never actually implemented such solution in production so there are might be better ideas out there.
Access permissions
Different users might need to be restricted from seeing sensitive reports (financial data). You can use cancancan to define permissions like this:
class Ability
include CanCan::Ability
def initialize(user)
if user.roles.include? :admin
can :manage, :all
elsif user.roles.include? :salesrep
can :run_user_report
cannot :run_money_report
else
end
end
class ReportsController < ApplicationController
def create
# do something like this:
report_permission = "run_#{params[:rep_type]_report"
if can? :"#{report_permission}"
ReportJob.perform_later(params[:rep_type], current_user.email)
else
redirect_to root_path, alert: 'You are not authorized to run report'
end
end
end
For more flexible solution checkout punding, less DSL and more Ruby code.
Data pre-generation and archiving
Even with declining storage costs you eventually need to start purging some of the granular data and only retain high level summary stats. I personally like to pre-generate data in all granularity levels upfront and then delete the more granular data sets. Let’s say we are tracking article views.
class AricleViews
# records each visit and unique IP address
include Mongoid::Document
include Mongoid::Timestamps::Created # no updates needed
belongs_to :article
field: ip, type: String
index({created_at: 1}, {expire_after_seconds: 1.week})
end
class DailyArticleViews
# times each article was read per day. You can track IP separately
include Mongoid::Document
include Mongoid::Timestamps::Created
belongs_to :article
field: day, type: Date
field: counter, type: Integer
index({created_at: 1}, {expire_after_seconds: 1.month})
end
class MonthlyArticleViews
# times each article was read per month
include Mongoid::Document
include Mongoid::Timestamps::Created
belongs_to :article
field: month, type: String
field: year, type: Integer
field: counter, type: Integer
end
Use Mongo TTL indexes to expire data (no need for cron job).
Alternatively you could model data differenly using Mongo documents where each documents contains records for all dates for each article.
# create records
DailyArticleViews.update_one( {article_id: article.id},
{ "$inc" => { :"#{time.to_date}" => 1 } }, :upsert => true )
# { "_id" : ObjectId("..."), "article_id" : ObjectId("..."), "2016-01-01": 1, "2016-01-02": 2, ... }
# query for records
DailyArticleViews.find(:article_id => id)
.projection(:article_id => 0, :_id => 0).first
Another option is to rollup details stats to daily summaries (deleting detailed stats), then convert daily to monthly (and delete daily records). I personally find it more complex and error prone.
Reporting API
Often users need to extract data from your application and load it into another system. Excell can be a viable alternative at small scale but you will need a more robust solution as you grow.
Reporting APIs can be synchronous with data sent the data on the first request. It is simpler to build but it can timeout if report generation takes too long. Or you can create asynchronous API where client is issued a token, report generation is queued up and client needs to make second request to retrieve the data within certain time period (before file is purged).
class GenerateReportController < ApplicationController
def index
job_id = GenerateReportJob.perform_later(params).job_id
render json: { job_id: job_id }
end
end
class GenereateReportJob < ActiveJob::Base
after_perform :upload_file_to_s3
def perform(*args)
# query DB for data
end
private
def upload_file_to_s3
# use job_id as S3 object key
end
end
class DownloadReportController < ApplicationController
def index
render json: { error: 'need to provide job_id' } unless params[:job_id].present?
# use AWS S3 client to get file from S3
# if file is not found respond with appropriate message
end
end
end
Here is a good blog post on using AWS S3 Ruby SDK. You can delete files from S3 using object expiration policy.
Also I prefer to create separate URL endpoints for APIs (api.mywebsite.com vs mywebsite.com) so it can be directed to different load balancer using DNS (eaiser to scale if needed). You could even build API with different technology (express, sinatra, or rails-api).
Well, that’s it for this post, I hope you find it useful. All comments are welcomed.