How to split HTML content?
Well, take text like this:
Ruby on Rails is a next-generation web-framework
YEAH!And of course Java killer?! :) ©
You can split it into pages in a smart way.
I implemented this as a gem for my internal use, and now that it works in my Rails 3.1.3 application, that is reason enough to make it public.
I hope it can be helpful.
See more at https://github.com/addagger/HtmlSlicer
Feedback is welcome. Thank you.
HtmlSlicer
A little gem for Rails 3 that helps you split textual content in a smart way. It is a quick approach to creating ‘pageable’ views for ActiveRecord model attributes, independent strings, or any other Ruby classes. Of course it can split HTML content. Moreover, it can ‘resize’ HTML tags that have a width= attribute*.
* Imagine you want to resize <iframe> embeddings from YouTube saved in static content.
Install
Put this line in your Gemfile:
gem 'html_slicer'
Then bundle:
% bundle
Implementation
Basic approach
slice <method_name>, <configuration>, [:config => <:style>]*
where:
- <method_name> - any method or local variable which returns the source String (can be called with .send()).
- <configuration> - Hash of configuration options and/or the :config parameter.
Basic example
class Article < ActiveRecord::Base
slice :content, :as => :paged, :slice => {:maximum => 20}, :resize => {:width => 300}
end
where:
- :content is an attribute accessor for Article which returns the target String object.
- :as is the name of the basic accessor for the result.
- :slice is a hash of slicing options, part of the configuration.
- :resize is a hash of resizing options, part of the configuration.
You can define any configuration key you want. For keys you omit, the default configuration options (if available) are picked up automatically.
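To make the idea of a class-level macro with default options more concrete, here is a minimal sketch in plain Ruby. It is not HtmlSlicer's implementation: SketchSliceable, SketchArticle and the DEFAULTS hash are hypothetical names made up for illustration, and the "slicing" is a naive cut of the first N characters rather than the unit counting the gem performs.

```ruby
# Hypothetical stand-in, NOT HtmlSlicer's code: a class macro that merges the
# given configuration over defaults and defines the accessor named by :as.
module SketchSliceable
  # Assumed defaults, for illustration only.
  DEFAULTS = { :slice => { :maximum => 2000 } }

  def slice(method_name, config = {})
    merged = DEFAULTS.merge(config)      # omitted keys fall back to defaults
    accessor = merged.fetch(:as)         # this sketch requires :as
    maximum = merged[:slice][:maximum]
    define_method(accessor) do
      source = send(method_name).to_s    # the source is read with .send()
      source[0, maximum]                 # naive: first N characters only
    end
  end
end

class SketchArticle
  extend SketchSliceable
  attr_accessor :content

  slice :content, :as => :paged, :slice => { :maximum => 20 }
  slice :content, :as => :full           # no :slice key, defaults apply
end

article = SketchArticle.new
article.content = "Words like violence break the silence"
puts article.paged.length  # prints 20
puts article.full          # prints the whole string, it is under 2000 characters
```

The merge here is shallow on purpose to keep the sketch short; a real configuration layer would likely merge nested hashes key by key.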
Console:
@article = Article.find(1)
@article.content
# => "Words like violence break the silence\r\nCome crashing in into my little world\r\n<iframe width=\"560\" height=\"315\" src=\"http://www.youtube.com/embed/ms0bd_hCZsk\" frameborder=\"0\" allowfullscreen></iframe>\r\nPainful to me, pierce right through me\r\nCan't you understand, oh my little girl?"
@article_paged = @article.paged
# => "Words like violence bre"
- the nil argument is treated as slice number 1.
@article_paged.slice!(2)
# => "ak the silence"
- the passed slice number is remembered.
@article_paged.slice!(4)
# => "rld "
Configuration options
All configuration keys:
- :as is the name of the basic accessor for the sliced object (result).
- :slice is a hash of slicing options.
- :resize is a hash of resizing options.
- :processors - processor names.
- :window - parameter for ActionView: the “inner window” size (4 by default).
- :outer_window - parameter for ActionView: the “outer window” size (0 by default).
- :left - parameter for ActionView: the “left outer window” size (0 by default).
- :right - parameter for ActionView: the “right outer window” size (0 by default).
- :params - parameter for ActionView: url_for parameters for the links (:controller, :action, etc.)
- :param_name - parameter for ActionView: the parameter name for the slice number in the links. Accepts Symbol, String, Array.
- :remote - parameter for ActionView: Ajax? (false by default)
- :config - special key for using a stylized configuration (a configuration defined in advance).
Slicing options
- :unit is a Regexp/String/Hash description of the text units counted to split the text into slices.
When the value is a Hash, the unit is assumed to be an HTML tag (see the :only/:except options for details). An undefined value or nil falls back to the default Regexp /&#?\w+;|\S/, which counts any regular character or HTML special character as one unit.
- :maximum is a Fixnum number of units in one slice.
If :unit is defined as a Regexp or String, the default value is 300.
If :unit is defined as a Hash, the default value is 10.
If :unit is the default, the default value is 2000.
- :complete is a Regexp description of the character used to complete the slice.
For example, if you want each slice to end with a complete word, use :complete => /\s+|\z/ and the counter continues the slice until the first whitespace character.
- :limit is a Fixnum limit on the number of slices.
In many cases we only need the first slice, to render it as a partial.
- :only is a Hash or Array of hashes describing exactly which nodes of the HTML content to slice.*
- :except is a Hash or Array of hashes describing exactly which nodes of the HTML content NOT to slice.*
* Actually the hash is an argument for the HTML::Conditions class (part of ActionPack’s html_scanner block). Look at github.com/rails/rails/blob/master/actionpack/lib/action_controller/vendor/html-scanner/html/node.rb
This is a very flexible utility for navigating HTML content. Read the native documentation for details.
For example: the ID for the <hr class="break"> tag is a hash: {:tag => "hr", :attributes => {:class => "break"}}
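The unit counting behind :unit, :maximum and :complete can be sketched in plain Ruby. This is a rough illustration only, not HtmlSlicer's implementation: sketch_slices is a hypothetical helper, it treats the input as plain text (no HTML node handling, no :only/:except, no :limit), and it simply reuses the default unit Regexp and the :complete Regexp quoted above.

```ruby
# Default unit pattern from the text: an HTML entity or one non-whitespace character.
UNIT_PATTERN = /&#?\w+;|\S/

# Hypothetical helper (not part of HtmlSlicer): cut +text+ into slices of at most
# +maximum+ units. When +complete+ is given, each full slice is extended up to the
# end of the first match of that pattern.
def sketch_slices(text, maximum:, unit: UNIT_PATTERN, complete: nil)
  slices = []
  start = 0   # where the current slice begins
  pos = 0     # where the next unit search begins
  count = 0   # units counted in the current slice
  while (m = unit.match(text, pos))
    count += 1
    pos = m.end(0)
    next if count < maximum
    if complete && (c = complete.match(text, pos))
      pos = c.end(0)
    end
    slices << text[start...pos]
    start = pos
    count = 0
  end
  rest = text[start..-1]
  if count > 0 || slices.empty?
    slices << rest unless rest.empty?
  else
    slices.last << rest   # trailing whitespace stays with the last slice
  end
  slices
end

text = "Tom &amp; Jerry run fast"
puts text.scan(UNIT_PATTERN).length                      # prints 16, "&amp;" is one unit
p sketch_slices(text, maximum: 6)
# => ["Tom &amp; Je", "rry run", " fast"]
p sketch_slices(text, maximum: 6, complete: /\s+|\z/)
# => ["Tom &amp; Jerry ", "run fast"]
```

Without :complete the first slice stops in the middle of "Jerry"; with /\s+|\z/ it runs on to the next whitespace, so every slice ends on a word boundary.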
Resizing options
- :width is a Fixnum number of pixels, the target width to squeeze the HTML tag to. The :height (if present) is scaled proportionally. Percentage values are ignored.
- :only is a Hash or Array of hashes describing exactly which nodes of the HTML content to resize.
- :except is a Hash or Array of hashes describing exactly which nodes of the HTML content NOT to resize.
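The proportional resize described for :width comes down to simple arithmetic, sketched below. sketch_resize is a hypothetical helper, not the gem's API; it works on a plain hash of attribute strings, and rounding the height to the nearest pixel is an assumption.

```ruby
# Hypothetical helper (not HtmlSlicer's API): scale a tag's pixel width to
# +target+ and scale the height by the same ratio. Percentage or missing
# widths are left untouched.
def sketch_resize(attrs, target)
  width = attrs["width"].to_s
  return attrs unless width =~ /\A\d+\z/ && width.to_i > 0

  ratio = target.to_f / width.to_i
  resized = attrs.merge("width" => target.to_s)
  height = attrs["height"].to_s
  if height =~ /\A\d+\z/
    # Rounding to the nearest pixel is an assumption of this sketch.
    resized["height"] = (height.to_i * ratio).round.to_s
  end
  resized
end

iframe = { "width" => "560", "height" => "315" }
resized = sketch_resize(iframe, 300)
puts resized["width"]   # prints 300
puts resized["height"]  # prints 169 (315 * 300 / 560 = 168.75, rounded)

fluid = { "width" => "100%", "height" => "315" }
puts sketch_resize(fluid, 300)["width"]  # prints 100%, percentages are ignored
```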
Processors
Used to transform the source text before it is sliced. Many of us use markup languages for dynamic content, and this is where they fit in. Just create a class as a subclass of HtmlSlicer::Processor, put it in the /lib/html_slicer_processors directory and specify its name in the :processors option.