I am trying to create a Rails app that will scrape a URL submitted in a form field. This will only be for URLs for one particular site that has the same structure on every page so it doesn’t have to be “dynamic” where it has to be that flexible (i.e. work for Forbes and Bloomberg). It would also be nice to save the text to the database with the link.
I’ve found many examples on how to do this with the URL hard-coded in the code, but I’m having trouble getting it to accept the link submission, grab the article text, put it in the database, and show the result.
Here’s what I have so far in my controller:
class LinksController < ApplicationController
before_action :set_link, only: %i[ show edit update destroy ]
def index
@links = Link.all
end
def show
end
def new
@link = Link.new
end
def edit
end
def create
@link = Link.new(link_params)
require 'open-uri'
page = Nokogiri::HTML(open(@link))
@article_text = page.css("div{itemprop='articleBody'}").text
respond_to do |format|
if @link.save
format.html { redirect_to @link, notice: "Link was successfully created." }
format.json { render :show, status: :created, location: @link }
else
format.html { render :new, status: :unprocessable_entity }
format.json { render json: @link.errors, status: :unprocessable_entity }
end
end
end
def update
respond_to do |format|
if @link.update(link_params)
format.html { redirect_to @link, notice: "Link was successfully updated." }
format.json { render :show, status: :ok, location: @link }
else
format.html { render :edit, status: :unprocessable_entity }
format.json { render json: @link.errors, status: :unprocessable_entity }
end
end
end
def destroy
@link.destroy
respond_to do |format|
format.html { redirect_to links_url, notice: "Link was successfully destroyed." }
format.json { head :no_content }
end
end
private
def set_link
@link = Link.find(params[:id])
end
def link_params
params.require(:link).permit(:web_address, :article_text)
end
end
If I need to provide any other code, please let me know and I will update the question.
Thanks!