Can someone help me with how to do this? What would the
"strip_code_snippets" method look like? I think I'd be fine finding the
<pre> tag but I don't know how to completely remove the text from <pre>
to </pre>. Any help would be greatly appreciated. Thanks in advance.
You basically create a regular expression to find the <pre> and </pre>
and nuke all the characters when you find the match.
I don't think my regex is correct. Advanced Ruby developers here can help.
>> my_string = "This is my little block of code <pre>puts 'I need to learn about Regexp'\nputs 'Will you help?'</pre>"
=> "This is my little block of code <pre>puts 'I need to learn about Regexp'\nputs 'Will you help?'</pre>"
>> regexp = %r{<pre\b[^>]*>.*?</pre>}m
=> /<pre\b[^>]*>.*?<\/pre>/m
>> my_string.gsub(regexp, '')
=> "This is my little block of code "
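Wrapped up as the method the original question asked about, that could look roughly like the sketch below. The method name comes from the question; the constant name PRE_BLOCK is made up here for illustration, and the pattern is case-sensitive, so it assumes the tags are written in lowercase.

```ruby
# Matches an opening <pre> tag (with optional attributes), everything up to
# the nearest closing </pre>, and the closing tag itself.
# PRE_BLOCK is a hypothetical name, not something from the thread.
PRE_BLOCK = %r{<pre\b[^>]*>.*?</pre>}m

# Returns a copy of text with every <pre>...</pre> block removed.
def strip_code_snippets(text)
  text.gsub(PRE_BLOCK, '')
end

my_string = "This is my little block of code <pre>puts 'I need to learn about Regexp'\nputs 'Will you help?'</pre>"
puts strip_code_snippets(my_string)
```

Since gsub returns a new string, my_string itself is left untouched.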
The square brackets [...] enclose a character set, and [^...] encloses a negated set (not one of those characters). The previous posting isn't even well-formed syntactically.
Here's a little explanation to get you started:
regexp = %r{<pre\b[^>]*>.*?</pre>}m
<pre = literal character matching
\b = word boundary
[^>] = character set matching anything that's NOT a >
* = zero or more times
> = literal >
.*? = any character (.) repeated zero or more times, but as few as possible to let the regexp match (*?)
</pre> = literal character matching
Note: the non-greedy *? is getting rather advanced; you can look at the pickaxe pp.68-77
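To see why the "as few as possible" part and the \b matter, here is a small sketch with a made-up input containing two <pre> blocks. The variable names are only for illustration.

```ruby
text = "one <pre>a</pre> two <pre>b</pre> three"

lazy   = %r{<pre\b[^>]*>.*?</pre>}m  # stops at the nearest </pre>
greedy = %r{<pre\b[^>]*>.*</pre>}m   # runs on to the last </pre>

lazy_result   = text.gsub(lazy, '')
greedy_result = text.gsub(greedy, '')

p lazy_result    # "one  two  three" (text between the blocks survives)
p greedy_result  # "one  three"      (everything between first <pre> and last </pre> is gone)

# \b keeps the pattern from treating a longer tag name as <pre>.
prefix_match = "<prefix>x</pre>" =~ lazy
p prefix_match   # nil
```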
The %r{ } is an alternate way to write a literal regular expression, which I used in lieu of escaping the / in </pre>. You can see the equivalent form that irb printed as the value.
The 'm' at the end is a flag to match multi-line input. It turns the '.' from matching "any character except newline" to simply "any character".
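A quick way to convince yourself of what the 'm' does is to run the same pattern with and without it on a snippet that spans two lines. This is just a sketch with a made-up input. (Worth knowing if you come from another language: what Ruby calls 'm' is what Perl and friends usually call the 's' flag.)

```ruby
text = "before <pre>line one\nline two</pre> after"

without_m = %r{<pre\b[^>]*>.*?</pre>}   # '.' will not cross the newline
with_m    = %r{<pre\b[^>]*>.*?</pre>}m  # '.' matches the newline too

without_m_result = text.gsub(without_m, '')
with_m_result    = text.gsub(with_m, '')

p without_m_result  # the string comes back unchanged, no match was found
p with_m_result     # "before  after"
```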