Censor Search Queries

No, I don't want to censor what people are querying, I want to censor
the searches I display.

I display the last 5 queries entered in the search box on my website.
I do not want to display offensive terms, such as the '7 words you
can't say on TV'. I am already censoring the words themselves, but I
also want to censor queries that contain those words.

In case you need a reason why: the webpage is on the website of a
major University. I want to provide students with search topic tips,
but do not want to display any offensive words.

This is what I am doing:
censor = get_censor # returns an array of offensive words
# only record the query when it is not exactly one of the censored words
unless censor.include?(query)
  mod.update_search
end

How do I extend this to filter out queries that contain the censored
terms? I am thinking of MySQL pattern matching with LIKE, but am not
sure how to do it.

Thanks in advance, K

Come on - this is an interesting problem. Too easy? Too hard? Too
controversial?

Kim wrote:

Come on - this is an interesting problem. Too easy? Too hard? Too
controversial?

Well, it certainly is interesting, and it can be a can of worms, as it
will require regular maintenance. This is because there are many
patterns that can be used to work around your censor.

I'd suggest you get to know regex pattern matching and see how
it can be done using MySQL functions. I think performance may
be an issue if you have a lot of content to filter.

If you can, try to filter as you store content in the DB, or run an
independent background process that cleans up the DB content while you sleep.

-- Long

Yes, my idea is to filter as I store in the DB. It seems like I should be able to use wildcards around a value to match it against a list of words. For example, right now if a user enters ‘ass’ the query gets filtered out and is not displayed, but if they enter ‘assmonkey’ it does not get filtered. If wildcards were used, it should be filtered too.
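
The wildcard idea can also be done in plain Ruby before anything reaches the database. The following is only a minimal sketch: `CENSORED_WORDS` and `contains_censored?` are hypothetical names standing in for whatever `get_censor` returns and however the check is wired in, neither of which is shown in this thread.

```ruby
# Hypothetical stand-in for get_censor; the real word list is not shown here.
CENSORED_WORDS = %w[ass damn].freeze

# Returns true when any censored word appears anywhere inside the query,
# ignoring case. This mirrors wrapping each word in wildcards ('%word%').
def contains_censored?(query, censor = CENSORED_WORDS)
  normalized = query.to_s.downcase
  censor.any? { |word| normalized.include?(word.downcase) }
end

puts contains_censored?('assmonkey')     # prints true
puts contains_censored?('rails plugins') # prints false
```

Note that plain substring matching also flags innocent queries: `contains_censored?('class schedule')` is true because 'class' contains a censored word. Whether that trade-off is acceptable depends on the word list.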

I do not expect to be able to filter every offensive query, but if I can get some of them I would be happier.

I am familiar with regex, but how would I apply it? See my first post.

Any other takers?

Am I correct in assuming you just want to match any censored phrase in
the query?

If so, it should be as simple as:

/#{censor.join('|')}/i.match(query)
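
One caveat with interpolating the joined list directly: any regex metacharacters in a censored word are treated as pattern syntax, and an empty list produces a pattern that matches every query. A possible variant, sketched here with the hypothetical helper names `censor_pattern` and `offensive?`, builds the pattern with Ruby's `Regexp.union`, which escapes each string:

```ruby
# Builds one case-insensitive pattern from the censored words.
# Regexp.union escapes metacharacters in each string, and for an empty
# list it returns a pattern that never matches.
def censor_pattern(words)
  Regexp.new(Regexp.union(words).source, Regexp::IGNORECASE)
end

# True when the query contains any of the censored words.
def offensive?(query, words)
  censor_pattern(words).match?(query.to_s)
end

words = ['ass', 'a.b'] # illustrative list only, not the real censor list
puts offensive?('AssMonkey', words) # prints true
puts offensive?('axb', words)       # prints false, the dot is literal
```

In the original snippet this would slot in as `unless offensive?(query, censor)`, assuming `censor` is the array returned by `get_censor`.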

-Shawn

Kim Griggs wrote:

Yes my idea is to filter as I store in the DB. It seems like I should be
able to use wildcards around a value to match it to a list of words. For
example, if a user enters 'ass' the query will get filtered out and not
displayed, but if they enter 'assmonkey' then it will not get filtered. If
wildcards were used then it should filter it.

I do not expect to be able to filter every offensive query, but if I can get
some of them I would be happier.

I agree, though you may want to consider whether the following are acceptable:
a s s
a # s # s
a.s.s
etc.

I hope you get the idea.
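
One way to catch spaced-out or punctuated variants like these is to strip everything except letters before comparing. This is only a rough sketch; `squash` and `disguised_match?` are hypothetical names, and it assumes the censored words are plain ASCII letters.

```ruby
# Lowercases the text and drops every character that is not a-z,
# so "a # s # s" and "a.s.s" both collapse to "ass".
def squash(text)
  text.to_s.downcase.gsub(/[^a-z]/, '')
end

# True when any censored word appears in the squashed query.
# Words that squash to an empty string are skipped, since every
# string includes the empty string.
def disguised_match?(query, words)
  squashed = squash(query)
  words.map { |w| squash(w) }.reject(&:empty?).any? { |w| squashed.include?(w) }
end

words = %w[ass] # illustrative list only
['a s s', 'a # s # s', 'a.s.s'].each do |q|
  puts disguised_match?(q, words) # prints true for each
end
```

Because spaces are removed too, this raises the false-positive rate: a query such as 'has sent' collapses to 'hassent' and is flagged. It also does nothing about digit or symbol substitutions, so it only narrows the gap rather than closing it.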

I am familiar with regex, but how would I apply it? See my first post.

Any other takers?

I think Shawn may have alluded to the start of a possible solution...

-- Long

I did a little searching, looks like there is a plugin that does much
of what you need:

http://locusfoc.us/2007/2/13/name-nanny-plugin

-Shawn

Shawn Roske wrote:

I did a little searching, looks like there is a plugin that does much
of what you need:

http://locusfoc.us/2007/2/13/name-nanny-plugin

Good find Shawn!

-- Long

Looks good - I will try it out. Thanks to all.