Unexpected results when validating the length of Action Text content?

I’m running into a validation issue with Action Text and I’m wondering whether there’s something I’m missing.


I have a model that has a length validation on some Action Text content:

class Post < ApplicationRecord
  has_rich_text :content

  validates_length_of :content, maximum: 500
end

When I type 500 characters into the text area field, I’d expect the validation to pass, but it fails, because the validation counts the 500 characters from the input plus the characters of the markup rendered by the partial layouts/action_text/_content.html.erb.

The root cause seems to be that the validation calls to_s on the content attribute. On Action Text content, to_s is a custom method whose return value includes the rendered partial (layouts/action_text/_content.html.erb).

I’ve worked around this by using a custom validation method, but is there a better way to handle this scenario? Has anyone else run into this issue or is this the expected behavior?


Two issues here.

First, don’t forget that invisible characters (spaces, carriage returns, tabs, etc.) count toward the length.

Second, your text could contain characters that take more than 8 bits to encode. For instance, the character “é” is encoded with 16 bits in UTF-8, and other characters, such as emojis, can take 24 or 32 bits. A count or length function that works on bytes rather than characters will count these 16-, 24-, or 32-bit characters as 2, 3, or 4 when each one is displayed as a single character. Counting UTF-8 characters can be a challenging task.
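To make the byte-versus-character distinction concrete, here is a small Ruby sketch. It assumes UTF-8 strings, and the sample characters are only illustrations. In Ruby, String#length counts characters (code points) while String#bytesize counts bytes.

```ruby
# Compare character counts with byte counts for a few UTF-8 strings.
# The samples are written as escapes so the code points are unambiguous.
samples = {
  "e"         => "e",          # plain ASCII letter
  "e-acute"   => "\u00E9",     # precomposed accented letter
  "euro sign" => "\u20AC",
  "emoji"     => "\u{1F600}"
}

samples.each do |name, s|
  puts "#{name}: length=#{s.length} bytesize=#{s.bytesize}"
end
# e: length=1 bytesize=1
# e-acute: length=1 bytesize=2
# euro sign: length=1 bytesize=3
# emoji: length=1 bytesize=4

# A letter followed by a combining accent is two code points but is
# displayed as one character; grapheme_clusters groups them together.
combined = "e\u0301"
puts combined.length                   # => 2
puts combined.grapheme_clusters.length # => 1
```

Which count matters depends on what is doing the counting: a check built on length counts code points, while a database column limit may be defined in bytes or in characters depending on the database.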

And if you use Trix the situation is even worse.

When you use Trix, it stores the HTML that describes your rich text, not the plain text you typed. Even if you add no formatting at all, the words you type will be wrapped in at least a div tag, so <div>your text</div> carries a minimum of 11 non-text characters, and likely many more from Trix-specific class names and data attributes.
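The overhead arithmetic is easy to check with a quick Ruby sketch. The markup strings here are hypothetical examples, not exact Trix output.

```ruby
plain   = "your text"
wrapped = "<div>#{plain}</div>"

# The opening tag is 5 characters and the closing tag is 6.
overhead = wrapped.length - plain.length
puts overhead # => 11

# Formatting adds more markup on top of that (illustrative markup only).
bold = "<div><strong>#{plain}</strong></div>"
bold_overhead = bold.length - plain.length
puts bold_overhead # => 28
```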

You could try adding a JavaScript character counter to the Trix editor, but it would need to inspect the visible composition area and extract only the plain-text equivalent of the contents. The textarea that Trix hijacks to build its editor will contain the full HTML version as its value, which is useless for what you want to do.

If you want to build a completely custom validator on the Ruby side, you could start by running strip_tags on the field content and counting the characters in the result. But if you are trying to keep a stored string below a certain number of characters, say, to avoid a database error from putting too much into a column, you will have a very hard time, because the limit you set has to include some amount of slop to account for the non-textual content.
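Here is a rough sketch of the strip-then-count idea in plain Ruby, so it runs without Rails. The names strip_markup and plain_text_length_ok? are hypothetical, and the regex is a naive stand-in for strip_tags that does not decode entities or handle every HTML input. In a real Rails app you would more likely call strip_tags, or Action Text's to_plain_text, inside a custom validation method.

```ruby
MAX_LENGTH = 500

# Naive tag stripper (hypothetical helper): removes anything that looks
# like an HTML tag. It does not decode entities such as &amp;.
def strip_markup(html)
  html.gsub(/<[^>]*>/, "")
end

# Returns true when the text left after stripping tags fits the limit.
def plain_text_length_ok?(html, max: MAX_LENGTH)
  strip_markup(html).length <= max
end

text = "a" * 500
html = "<div>#{text}</div>"

puts html.length                 # => 511, the raw markup is over the limit
puts plain_text_length_ok?(html) # => true, the visible text is exactly 500

too_long = "<div>#{text}a</div>"
puts plain_text_length_ok?(too_long) # => false, 501 visible characters
```

This only checks what the user sees. It says nothing about whether the stored HTML fits a column limit, which is where the slop mentioned above comes in.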

