I need help saving table data from a rake task

The matches are:

to_teamid | to_ppcs
ro_teamid | ro_ppcs
po_teamid | po_ppcs
so_teamid | so_ppcs
etc.
etc.
.. for 14 total rake subtasks

Each of the matches contains 120 teams and 120 PPCS values corresponding
to that column/field.

I need to organize the entire dataset so that..

to_ppcs values go into the to_ppcs column where to_teamid == team_id
ro_ppcs values go into the ro_ppcs column where ro_teamid == team_id
po_ppcs values go into the po_ppcs column where po_teamid == team_id
so_ppcs values go into the so_ppcs column where so_teamid == team_id
etc.
etc.
.. for 14 total arrays..

Each array contains exactly 120 rows of data..

I hope this additional information helps.

I think that you aren't getting much response because:
   A) you aren't giving the right amount of detail
   B) you seem to be asking for design help

From your other thread, http://pastie.org/548692 shows that you are calling a .compiled_this_week method on each of several models, but you don't show that code (or even the inspect'ed array of data that you have).

I think that a rake task is probably the wrong way to approach this. It's more likely that you want to do a script/runner Tsos.update (for some method 'update' on the class Tsos). Your approach seems very much procedural and not at all object-oriented.

If you back up just a bit and give some of your assumptions (like how you get data into the database in the first place), you might get a few useful hints as to how to proceed. Since you are asking your questions on the Rails mailing list (or posting to that forum), I'm going to assume that you do actually have a Rails application surrounding this data. However, you haven't really been asking Rails questions so far; they've been either Ruby questions or design questions. That might be part of the problem, too.

-Rob

Rob Biedenharn http://agileconsultingllc.com
Rob@AgileConsultingLLC.com

Hi Rob,

Here are my assumptions:

I utilize Rake Tasks in the same way as one would a cron job - to
perform a procedural task on "something" once per week. The rake tasks
I create will be manually run at the start but in the end, I will have
cron jobs run each of the rake tasks that I create one to three times
per week.

What do my rake tasks do?

1. Rake task one reaches out to a number of official ncaa sites,
parses/scrapes all of the data for the given week and uploads that data
into 37 statistical tables.

2. Rake task two (the one I'm working on now which is still incomplete)
compiles a ratings strength number for each team listed in each
statistical table (37 tables), categorizing the data by offense,
defense, special teams, and turnover margin. This data is then saved
into their respective "ratings" tables (offense, defense, special teams,
turnover margin).

3. Rake task three (not designed yet) will go through the user
tables and purge membership accounts based on a weekly subscription
format.

These are the only 3 rake tasks I plan on using with my site for now.

About Design:

I'm really not asking for design help. I know how I personally see it
and how it "should" work. I'm asking if it can be done in the way I've
designed it. So far, all of the rake tasks I've created work perfectly
for what I've designed them to do thus far. You are correct in that I
have not supplied a lot of code with respect to rake task #2. The fact
is, I cannot supply more than the mechanics of it. The ratings system
is a private system that includes a lot of math functionality and is the
core reason why my site works so well.

Quoting Kenneth Massey (One of the BCS computer gurus from an email he
wrote to me):

Hi Joel (at least now we know your name ;) )

You are correct in that I have not supplied a lot of
code with respect to rake task #2. The fact is, I
cannot supply more than the mechanics of it. The
ratings system is a private system

A lot of us here have day jobs that preclude us posting 'real' code.

The solution to the dilemma we, and you, face (i.e., needing help and
not being able to supply the information folks here need to have to
provide it) is what's known as 'a sandbox.'

What you do is create a separate app that's scaled down to include only
the components that are absolutely necessary to demonstrate the problem
you're having. That's what you post and ask questions about.

One of the things that has made it difficult to assist you is the excess
of information you're providing. We're here to help but, as I'm sure
you've noticed, there are more folks asking questions here than
providing answers. The more specific you make your problem statements,
the more likely it is that we'll be able to help without leaving others
unattended.

HTH,
Bill

Can I make a recommendation? Write some tests. That way you can tell
what exactly is passing and what is failing your expectations. It may
help you find your problem. It might be even easier for others to help.

On the other hand, if you want to push forward, it seems like the
pattern you are looking to implement is:

- Grab data from an external data source and calculate statistics for
some teams, each with a known unique arbitrary identifier.
- Find the existing data for that same team and update each row
matching the arbitrary identifier with the new data.

Questions:

- Is the above an accurate assessment?
- Do you have the first part handled?
- Do you have any of the second part handled?

This would help me and possibly others on the list understand where
you think your progress has brought you and help us get you unstuck.

Steve Ross wrote:

Can I make a recommendation? Write some tests. That way you can tell
what exactly is passing and what is failing your expectations. It may
help you find your problem. It might be even easier for others to help.

The tests aren't the issue Steve. All tests prior to this point work
100%. The issue is I do not "know how" to update [one] table with 14
arrays of data.

- Grab data from an external data source and calculate statistics for
some teams, each with a known unique arbitrary identifier.

No - not accurate. I already pulled the data. The data exists in my
"personal" tables (37 of them). This data is exact and completed from a
development and testing scenario.

- Find the existing data for that same team and update each row
matching the arbitrary identifier with the new data.

Find the existing data within 1 of 37 tables, perform mathematical
calculations on that data, assign a ratings value for each team, send
the data back to rake to be held in "queue" [persistent data].

Find the existing data within 2 of 37 tables, etc., etc. to be held in
"queue" [persistent data].

Do this for 14 of 37 tables. These 14 tables now make up 28 arrays (14
arrays that are paired with another by foreign_key).

Array 1 houses the team_id for table 1
Array 2 houses the rating for team_id for table 1

Array 3 houses the team_id for table 2
Array 4 houses the rating for team_id for table 2

etc...

I have all of these arrays populated with data, all verified through
testing, and all of them check out perfectly...

So if I were to abstract this one level, I would say: “you want a container-like data structure that can describe a unique team id (from the odd-numbered arrays) and the 14 ratings for that team (from the even-numbered arrays). By iterating over this container, you will then update each database row that corresponds to the team id.” Is this a correct description of your goal?

Sorry if I’m not getting this.

I think the underlying difficulty is that you need to learn about a collection other than Array ;)

Take a look at the docs for Hash.

It is sounding like you want a representation like:

[{ :team_id => 1, :desc_of_array_2 => 42.14, :desc_of_array_4 => 18.97 },
  { :team_id => 2, :desc_of_array_2 => 97.32, :desc_of_array_4 => 49.97 },
...
  ]

or even a hash that maps team_id to its set of stats like:

{ 1 => { :desc_of_array_2 => 42.14, :desc_of_array_4 => 18.97 },
   2 => { :desc_of_array_2 => 97.32, :desc_of_array_4 => 49.97 },
   ...
   }

This collection kinda looks like an array when accessed because the method is [] for both Array and Hash. If you want the array 12 value for team 87, you'd have (assuming that the hash is in a variable called stats):

stats[87][:desc_of_array_12]

I'm assuming that you'd have more "natural" names for the :desc_of_array_N

Note that I'm using :symbols, but you could use 'strings' instead.
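
To make the lookup concrete, here is a minimal sketch of the second form (a hash keyed by team_id). The :desc_of_array_N keys and the numbers are the same placeholders used above, not real data, and the variable names are only for illustration.

```ruby
# Placeholder stats hash keyed by team_id; keys and values are made up.
stats = {
  1 => { :desc_of_array_2 => 42.14, :desc_of_array_4 => 18.97 },
  2 => { :desc_of_array_2 => 97.32, :desc_of_array_4 => 49.97 }
}

# The first [] takes the team_id and returns that team's inner hash.
team_two = stats[2]

# A second [] takes the stat name and returns the single value.
array_4_for_two = stats[2][:desc_of_array_4]

# A team_id that was never stored comes back as nil instead of raising.
missing = stats[99]

puts team_two.inspect
puts array_4_for_two
puts missing.inspect
```

Because an unknown team_id returns nil, chaining a second [] onto it would raise NoMethodError, so it is worth checking for nil first when the id comes from outside.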

-Rob

Rob Biedenharn http://agileconsultingllc.com
Rob@AgileConsultingLLC.com

Just saw your other thread on the ruby list, but I'll answer here, too.

Put this at the end of your pastie and see if it helps you see.

# referencing a new key causes an empty hash to be stored as the value
stats = Hash.new {|h,k| h[k] = {} }
# i will take on the same values as 0.upto(119)
120.times do |i|
   stats[to_team_id[i]][:to] = to_ppcs[i]
   stats[ro_team_id[i]][:ro] = ro_ppcs[i]
   stats[po_team_id[i]][:po] = po_ppcs[i]
   stats[so_team_id[i]][:so] = so_ppcs[i]
   stats[rzo_team_id[i]][:rzo] = rzo_ppcs[i]
   stats[flo_team_id[i]][:flo] = flo_ppcs[i]
   stats[pio_team_id[i]][:pio] = pio_ppcs[i]
   stats[too_team_id[i]][:too] = too_ppcs[i]
   stats[sao_team_id[i]][:sao] = sao_ppcs[i]
   stats[tflo_team_id[i]][:tflo] = tflo_ppcs[i]
   stats[peo_team_id[i]][:peo] = peo_ppcs[i]
   stats[fdo_team_id[i]][:fdo] = fdo_ppcs[i]
   stats[tdco_team_id[i]][:tdco] = tdco_ppcs[i]
   stats[fdco_team_id[i]][:fdco] = fdco_ppcs[i]
end

puts stats.inspect
# or this might be easier to read
require 'pp'
pp stats
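
If the fourteen near-identical lines bother you, the same hash can probably be built with a loop over the pairs. This is only a sketch: the three-element arrays are made-up stand-ins for your real 120-element ones, and the pairs hash is a name I'm introducing for illustration.

```ruby
require 'pp'

# Made-up data standing in for two of the fourteen array pairs.
to_team_id = [18, 55, 7]
to_ppcs    = [398.83, 270.17, 301.5]
ro_team_id = [55, 7, 18]
ro_ppcs    = [148.67, 160.0, 203.33]

# Hypothetical mapping from each stat name to its [team ids, values] pair.
pairs = {
  :to => [to_team_id, to_ppcs],
  :ro => [ro_team_id, ro_ppcs]
}

# Same default block as above: a new key gets an empty hash as its value.
stats = Hash.new { |h, k| h[k] = {} }

pairs.each do |name, (ids, values)|
  # zip walks both arrays in step, so no index or hard-coded 120 is needed
  ids.zip(values).each do |team_id, value|
    stats[team_id][name] = value
  end
end

pp stats
```

Note that the team ids do not need to be in the same order in each pair, since every value is filed under its own team_id.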

-Rob

Rob Biedenharn http://agileconsultingllc.com
Rob@AgileConsultingLLC.com

From Rake, I believe all I need to do is this:

update_tsos_offense.table_update(TsosOffense, stats)

Which will continue with the open object, call the table_update method
in TsosOffense model, and pass stats which holds the array.

Then, in the TsosOffense model table_update method, I can iterate over
the stats array passed...

Still some work to do but I believe it should be as simple as this..

Correct me if I'm wrong.

Thanks,

Hi guys/gals,

Okay some good news. I'm able to pass the information to the correct
model and inspect it. I just don't know how to iterate through this
type of array. As it contains a hash setup, I'm not as experienced with
this piece. Could someone give me some pointers on how to iterate
through this data in my table_update method?

Here's what I have so far:

def table_update(model, constant, array)
  #puts array.inspect
  if model.compiled_this_week.find(:all).empty?
    puts "Updating #{model} for the following teams:"
    array.each do |row|
      values = {:compiled_on => Date.today.strftime('%Y-%m-%d')}
      constant.each_with_index do |field, i|
          values[field] = row[i]
      end
      model.create values
    end
  else
    # data is already populated for the week so don't update
    puts "Current Week's Ratings are Already updated!"
  end
end

compiled_this_week is just a scope that checks for between dates and I'm
finding out if the table is empty. If it is empty for the specified
dates (current week basically) I populate the table with new data..

"constant" refers to a constant I have setup in my environment.rb file
which houses the fields I'm going to populate in the table:

TSOS_OFFENSE = [:team_id, :totoff, :rushoff,
:passoff, :scoroff, :rzonoff, :fumlost, :passhint,
:tolost, :sacksall, :tackflossall, :passeff,
:firdwns, :thrdwncon, :fthdwncon]

"array" refers to the stats that were pulled in rake. They look and
appear just like the following if I perform a puts array.inspect:

{18=>{:tolost=>15, :passoff=>195.5, :fthdwncon=>44.44,
:sacksall=>2.42, :scoroff=>27.0, :tackflossall=>6.08,
:rzonoff=>0.85, :passeff=>130.13, :fumlost=>7,
:firdwns=>19.83, :totoff=>398.83, :passhint=>8,
:thrdwncon=>37.34, :rushoff=>203.33}

55=>{:tolost=>17, :passoff=>121.5, :fthdwncon=>40.0,
:sacksall=>2.42, :scoroff=>18.08, :tackflossall=>4.38,
:rzonoff=>0.91, :passeff=>94.95, :fumlost=>9,
:firdwns=>13.67, :totoff=>270.17, :passhint=>8,
:thrdwncon=>28.85, :rushoff=>148.67}

etc...

I just don't know how to iterate through this type of array. As you can
see the array houses the exact names of my constant...

Any help would be appreciated...
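
For what it's worth, the structure shown above is a Hash whose values are also Hashes, so each yields two things per entry: the team_id key and that team's inner hash. A minimal sketch, using two entries from the inspect output trimmed to three stats each (the rows variable is just a name chosen for illustration):

```ruby
# Two entries taken from the inspect output above, trimmed for brevity.
stats = {
  55 => { :totoff => 270.17, :rushoff => 148.67, :passoff => 121.5 },
  18 => { :totoff => 398.83, :rushoff => 203.33, :passoff => 195.5 }
}

# sort_by returns [team_id, ratings] pairs ordered by team_id;
# each pair is split into two block arguments.
rows = stats.sort_by { |team_id, _ratings| team_id }.map do |team_id, ratings|
  # merge returns a new hash, so the inner hash in stats is left untouched
  { :team_id => team_id }.merge(ratings)
end

rows.each { |row| puts row.inspect }
```

Each element of rows is then a flat hash of column names to values, which is the shape that model.create expects.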

Okay, after testing and testing, I finally managed to get it all to
work. However, I'm sure my way is very clumsily implemented but it was
the only way I understood how to read the values and place them into the
table.

I called the following from Rake:

update_tsos_offense.table_update(TsosOffense, stats) # model, # array

And in the model for table_update I did:

def table_update(model, array)

  if model.compiled_this_week.find(:all).empty?
    puts "Updating #{model} for the following teams:"
    120.times do |i|
      team = Team.find(i + 1)
      values = {:compiled_on => Date.today.strftime('%Y-%m-%d')}
      values[:team_id] = i + 1
      values[:totoff] = array[i + 1][:totoff]
      values[:rushoff] = array[i + 1][:rushoff]
      values[:passoff] = array[i + 1][:passoff]
      values[:scoroff] = array[i + 1][:scoroff]
      values[:rzonoff] = array[i + 1][:rzonoff]
      values[:fumlost] = array[i + 1][:fumlost]
      values[:passhint] = array[i + 1][:passhint]
      values[:tolost] = array[i + 1][:tolost]
      values[:sacksall] = array[i + 1][:sacksall]
      values[:tackflossall] = array[i + 1][:tackflossall]
      values[:passeff] = array[i + 1][:passeff]
      values[:firdwns] = array[i + 1][:firdwns]
      values[:thrdwncon] = array[i + 1][:thrdwncon]
      values[:fthdwncon] = array[i + 1][:fthdwncon]
      model.create values
      puts "#{team.name} values are being saved."
    end
  else
    # data is already populated for the week so don't update
    puts "Current Week's Ratings are Already updated!"
  end
end

I had to add 1 because the i count started at 0. I also couldn't use
the constant or iterate using each or each_with_index because I
couldn't get the results to sort.

This way does work though and so I'm happy that it at least is
functioning. Although, I'm sure it can use some cleanup.
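
One possible cleanup, offered as a sketch rather than a drop-in replacement: iterate over the sorted keys of the hash and let the field list drive the assignments. The helper name rating_rows is hypothetical, the field list is copied into a local variable so the example runs outside Rails, and the compiled_this_week guard and the model.create call are left out.

```ruby
require 'date'

# Build one attributes hash per team, ordered by team_id.
# fields is the list of column names; :team_id is taken from the hash key.
def rating_rows(stats, fields, compiled_on)
  stats.keys.sort.map do |team_id|
    values = { :compiled_on => compiled_on, :team_id => team_id }
    (fields - [:team_id]).each do |field|
      values[field] = stats[team_id][field]
    end
    values
  end
end

# Local copy of the TSOS_OFFENSE field list from above.
tsos_offense = [:team_id, :totoff, :rushoff,
                :passoff, :scoroff, :rzonoff, :fumlost, :passhint,
                :tolost, :sacksall, :tackflossall, :passeff,
                :firdwns, :thrdwncon, :fthdwncon]

# Two teams with only two stats each; the real hash has all fourteen.
stats = {
  55 => { :totoff => 270.17, :rushoff => 148.67 },
  18 => { :totoff => 398.83, :rushoff => 203.33 }
}

rows = rating_rows(stats, tsos_offense, Date.today.strftime('%Y-%m-%d'))
rows.each { |values| puts values.inspect }
```

Inside table_update, each of these hashes could then be passed to model.create. Using the keys of the hash also removes the assumption that team ids run exactly from 1 to 120.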