RoR on Amazon Elastic Compute Cloud (EC2)?

I searched the group and saw a few posts about experience with, and integration of, RoR and the Amazon S3 data service, but I'm curious whether anyone is currently serving a RoR app from Amazon EC2.

Any experiences? It sounds interesting.

I've used it in a testing environment.

We were looking at using EC2 to scale. EC2 doesn't have persistent
disks, so for a mainline web hosting solution it's not really
workable at this point (this might change).

We want our Rails app to be able to sense high load, start up some
EC2 instances, send them some data and start serving. All the tech is
there, it's just a matter of stitching it together.
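
A minimal sketch of the "sense high load, start up some instances" decision, in plain Ruby. The method name, the target load per server and the cap on extra instances are all hypothetical; actually launching instances would go through whatever EC2 tooling you use, which is not shown here.

```ruby
# Hypothetical helper: given the current load average per server, work out
# how many extra instances would bring the per-server load back to a target.
# The thresholds are illustrative assumptions, not recommendations.
def desired_extra_instances(load_per_server, servers:, target_load: 2.0, max_extra: 5)
  total_load = load_per_server * servers
  needed = (total_load / target_load).ceil # servers required to hit the target
  (needed - servers).clamp(0, max_extra)   # never negative, never above the cap
end

puts desired_extra_instances(1.0, servers: 2)
puts desired_extra_instances(6.0, servers: 2)
```

Capping the result keeps a load spike (or a bad measurement) from starting an unbounded number of paid instances.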

Very, very cool stuff.

I think a number of people would like something like this.

The biggest issue I can see here is the database access. I think having a single permanent server configuration with a set of EC2 images using some sort of VPN to your permanent database server will be the way to go.

I've seen no discussion of anyone doing this work yet, but there are some cool tools showing up for managing EC2 instances.

If your application is mostly view only, you could get away with
shipping a copy of the DB to the EC2 instance at init and then having
it pump out data with some kind of fixed reload schedule.

For interactive apps, it's more difficult but if you can find a way
to segment your app (i.e. all 'store' requests stay on the local
cluster, 'read' requests go to EC2 cluster), you're in there.
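
One rough way to picture that segmentation, assuming reads can be recognised by HTTP method. The cluster names and the choice of GET/HEAD as "read" requests are assumptions for illustration only.

```ruby
# Hypothetical cluster labels; a real setup would map these to proxy backends.
LOCAL_CLUSTER = "local".freeze
EC2_CLUSTER = "ec2".freeze

# Assumption: GET and HEAD never modify data, so they can go to read replicas.
READ_METHODS = %w[GET HEAD].freeze

# Picks the cluster that should serve a request with the given HTTP method.
def cluster_for(http_method)
  READ_METHODS.include?(http_method.to_s.upcase) ? EC2_CLUSTER : LOCAL_CLUSTER
end

puts cluster_for("GET")
puts cluster_for("POST")
```

This only works if the app really keeps GET requests free of writes; anything that stores data on a GET would have to be routed by path instead.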

A lot of apps we do are read heavy (sites for large rock bands,
labels, etc...) so this model is workable for a lot of what we do.
Plus, other apps that need clients to do stuff like process large
data sets or some distributed network tasks - EC2 is perfect. Combine
it with Amazon SQS (their queue) and you have a great setup.
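
To illustrate the queue-plus-workers pattern, here is a sketch that uses Ruby's built-in in-memory Queue as a stand-in for a remote queue. It makes no SQS calls at all; the method name, job names and worker count are made up for the example.

```ruby
# In-memory stand-in for a remote work queue. Queue#<< and Queue#pop play
# the role of "push" and "pop"; no real SQS request is made here.
def process_all(job_names, worker_count: 2)
  jobs = Queue.new
  results = Queue.new
  job_names.each { |name| jobs << name }

  workers = Array.new(worker_count) do
    Thread.new do
      loop do
        begin
          name = jobs.pop(true) # non-blocking pop raises ThreadError when empty
        rescue ThreadError
          break                 # nothing left to do, so this worker exits
        end
        results << "processed #{name}"
      end
    end
  end
  workers.each(&:join)

  done = []
  done << results.pop until results.empty?
  done.sort
end

puts process_all(%w[a.csv b.csv c.csv])
```

With a real remote queue each worker thread would be a separate EC2 instance, and the pop would be a network call rather than a method call.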

Do you have any links to any of these tools?

I know Marcel and others working on the S3 library have talked about abstracting out the common bits and doing interfaces for SQS and EC2. There is also another Ruby plug-in for SQS (SQS is a very simple interface ... push and pop and that's about it).

DB access to a read/write DB is going to be a performance killer.

Since most applications make several DB requests/page, keeping latency low between the application server and the DB server is *critical*.

I once saw a site get expanded from one machine to two, with web/app on one and DB on the second. Performance was *terrible*.

Turns out, a large box hosting provider had assigned a second box in a separate datacenter from the first. Ping times between the two were under 10ms, typically around 4.5-5.5ms (which is very low!) but that was enough to make the site feel very pokey, indeed!
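
The arithmetic behind that, as a small sketch. It assumes the queries on a page run one after another, and the figure of 20 queries per page and the 0.2ms same-rack round trip are assumptions for illustration; only the roughly 5ms ping comes from the story above.

```ruby
# Time a page spends purely waiting on the network for its DB queries,
# assuming the queries are issued sequentially (one round trip each).
def db_wait_ms(queries_per_page, round_trip_ms)
  queries_per_page * round_trip_ms
end

same_rack = db_wait_ms(20, 0.2)  # assumed LAN round trip
cross_site = db_wait_ms(20, 5.0) # round trip similar to the ping times above

puts "same rack:  #{same_rack} ms per page"
puts "cross site: #{cross_site} ms per page"
```

The per-query delay looks harmless on its own; it is the multiplication by the number of queries per page that makes the site feel slow.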

Thanks to all for your responses!

Sounds like the main questions are persistence of your server image & latency.

I'm curious what the latency is between the EC2 service and the S3 service. Since they are designed to work together, I would imagine Amazon would try to keep it pretty low.

> Since most applications make several DB requests/page, keeping latency
> low between application server, and DB server is *critical*.
>
> ...
>
> I'm curious what the latency is between the EC2 service and the S3 service. Being that they are designed to work together, I would imagine Amazon would try to keep it pretty low.

It should be, but S3 storage may not be enough (it's not a relational DB [1]), so it may be easy only if most of what you are serving can be stored on S3 (e.g. attachments, media files, ...). Otherwise you'd have to connect to a DB somewhere else.

bye Luca

[1] Some kind of persistence software for S3 may emerge, but nothing I have seen so far is generic enough to be a complete substitute for a relational DB.

> Since most applications make several DB requests/page, keeping latency low between application server, and DB server is *critical*.
>
> ...
>
> I'm curious what the latency is between the EC2 service and the S3 service. Being that they are designed to work together, I would imagine Amazon would try to keep it pretty low.

As I mentioned, even very low latency (4-6ms) is not low enough for DB usage.

With EC2 and S3, you simply don't know where on the planet your files are stored, or where your computer is. It may be small enough now that you do, but in the future, you really won't know, and I guarantee you that as they grow the location will grow indeterminate.

> They should be, but your S3 storage may not be enough (it's not a relational db [1]) so it may be easy if most of what you are serving is storable on S3 (e.g. attachments/media files/... ). Otherwise you'd have to connect to some db somewhere else.
>
> bye Luca
>
> [1] some kind of persistence software for S3 may emerge but nothing I have seen so far is generic enough to be a complete substitute for a relational Db

And I haven't seen anything that mimics a filesystem, either...

I'm afraid that is a myth.

The EC2 disk is persistent until the instance is terminated via a terminate command, a normal shutdown or an operating system failure. It persists across reboots quite happily.

So if you never shut it down, it is just like any other server - except that it has 1.75GB of memory and 160GB of disk space for 10 cents an hour. Plenty of room to let your mongrels roam.

So the way you approach EC2 is to treat it like a normal server, use the S3 as a backup area and treat the few, if any, shutdowns you have as you would a filesystem failure.

If nothing else it makes you think about the design of your backup/restore fire drills.

I've had the good old 'depot' application running in an infrastructure-separated multi-tenant configuration (i.e. each tenant is a separate Unix and MySQL user with their own database/codebase), all hanging quite happily off a dynamic domain. The EC2 machine runs the usual infrastructure and about 60 mongrels without causing too many problems, although I haven't really clobbered an instance yet with any serious tests to see if the database paths hold up under stress.

I have Debian Etch AMIs on S3 at the moment that I am using for testing purposes, and I have a couple of Rubyforge open source projects running - vmbuilder and multi-tenant. Anybody interested in pushing forward the state of the art in EC2 Rails images and hosting is welcome to contribute: http://rubyforge.org/forum/forum.php?forum_id=10902

NeilW

And why would that be a problem if the response is adequate?

I don't know if I would call it a myth. Many potential EC2 users take
pause due to this detail. It's not something I just made up.

What you say is correct but the lack of a real, persistent disk is
something that is discussed on the EC2 forums all the time. For some,
the current setup is workable but for a lot of solutions, the disk
going away under the circumstances you outlined is a problem. It
sounds like they are using commodity hardware for EC2's nodes - even
a minor failure isn't impossible and in the EC2 scenario, your data
goes away where with a 'real' persistent disk, there's at least a
chance it would survive.

Of course nothing takes the place of backups but when restoring from
S3 would take minutes (sometimes many minutes), not having a disk
starts to hurt for some applications.

There's lots of talk of ways to hack filesystems on to EC2 using S3,
etc...

One interesting thread: Forums? threadID=12761&tstart=165

Don't get me wrong, I think EC2 is very cool... It sounds like if they
added a few features to either EC2 or S3 that these little annoyances
would go away... From what I read, Amazon is listening and wants to
keep improving things.

> It sounds like they are using commodity hardware for EC2's nodes - even a minor failure isn't impossible

It isn't impossible anywhere. I can promise you that the bigger they are the harder they fall, and the more resilient they promise to be the less rigorous the backup regime...

I think having scary metal underneath you is a good thing. It makes you think about the things that can go wrong.

> Of course nothing takes the place of backups but when restoring from
> S3 would take minutes (sometimes many minutes), not having a disk
> starts to hurt for some applications.

Minutes, hey. How the world changes...

However for the vast majority of applications it ain't a problem. I can absolutely guarantee you that the world will not stop turning if a website is out of action for a few minutes once in a blue moon. Last time I checked people still go to the bathroom.

The people who decide whether it is an issue or not are not the technicians but the ones writing the cheques. You go to them with the numbers. They can either pay $80ish a month for an EC2 instance or they can pay Tom $250 per month for a slot on his excellent Engine Yard where he will quite happily guarantee lots of things for them in return.
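
For anyone putting those numbers together, a rough sketch of the sum. The 10 cents an hour is the instance price mentioned earlier in the thread and the $250 is the figure above; the 30-day month is an assumption, and bandwidth and S3 storage charges are left out.

```ruby
# Instance-hours only: hourly rate times hours in a month.
# Assumes the instance runs all month; bandwidth and storage are not included.
def monthly_instance_cost(hourly_rate, days: 30)
  (hourly_rate * 24 * days).round(2)
end

ec2_monthly = monthly_instance_cost(0.10)
managed_monthly = 250.0 # the managed-hosting figure quoted above

puts "EC2 instance-hours: $#{ec2_monthly} per month"
puts "Difference:         $#{(managed_monthly - ec2_monthly).round(2)} per month"
```

The difference is what the cheque-writers would be paying for the guarantees, so it is the number to put next to the cost of a few minutes of downtime.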

But for the majority of web applications I reckon the EC2 disk is persistent enough and that the EC2 architecture is good enough.

> There's lots of talk of ways to hack filesystems on to EC2 using S3,
> etc...
>
> Don't get me wrong, I think EC2 is very cool... It sounds like if they
> added a few features to either EC2 or S3 that these little annoyances
> would go away... From what I read, Amazon is listening and wants to
> keep improving things.

Yes, I know they do. But it ain'

How is yours different from the existing EC2 plugins, e.g.

http://rubyforge.org/projects/amazon-ec2/

They launch EC2 instances.

vmbuilder builds the Rails AMI which you use to launch EC2 instances. multi-tenant deploys Rails applications automatically on top of launched EC2 instances.

I may end up incorporating EC2 plugins yet into multi-tenant, but it isn't top of my list.

So there's your first job if you want it. :)

Furthermore, I came across the following site, which is kind of cool. Sign-up doesn't work right now, but they have a screencast :=)

http://info.aws-console.com/

What's your view about them? I've been thinking about EC2/S3 for hosting my project but haven't reached the point of deployment yet. :) As you have done that, and what you are saying here makes a lot of sense to me, I'd like to hear more about the above.

Never used it. Looks like a pretty front-end to the launch tools.

NeilW

Howdy! Jonathan here from AWS-Console.com.

> Furthermore I came across the following site which is kinda cool but
> sign-up doesn't work now

Our apologies on this--on the morning of the 25th we were getting an error back from an EC2 call that is part of our signup process. We notified Amazon of the issue and they resolved it in a few hours. You may have gotten a follow-up message from us to return and sign up (we sent out a number of these). If not, you are welcome to return now.

We are a fairly unique use-case for EC2 since we create a large number of security groups--and we think this usage uncovered an edge case in their system. We intend to remain a beta product as long as they do!

> http://info.aws-console.com/
>
> Whats your view about them.. I been thinking about EC2/S3 for my project
> hosting but haven't yet reach the point of deployment yet. :) As you
> have done that and what you are saying here makes lot of sense to me.. I
> like to hear more about the above ..
>
> Never used it. Looks like a pretty front-end to the launch tools.

Thanks for the mention here! A little background on us--we were having a string of bad luck with hosting providers around the time that EC2 was released. We did our initial investigation into the platform and, although very exciting, we saw two really serious issues--the dynamic IP assignment (there are some rare, but problematic scenarios with this detailed at length in the forums) and the lack of "permanent" local storage.

We chose to continue to work with EC2 and use it for a project that required a small cluster of video processing servers. Just using EC2 for these tasks, we realized that the command-line tools were not going to meet our needs (even for the short-term). Thus started the development of AWS-Console.

As we got to know EC2 (and Amazon's intentions with it) we saw a definite future for bringing web deployments to EC2 and a path to resolve the two notorious problems above. We have since migrated a number of our own products to EC2 (including AWS-Console itself) and are learning to address the needs of EC2-based deployments. Besides being a front-end to EC2/S3, AWS-Console now provides some features we find handy starting with automated database backups (very simplified currently), basic permissions and auditing (so a company EC2/S3 account can realistically be shared by a number of employees/users), and monitoring. You can see where we've been focusing at:

Our pricing is pretty reasonable too--currently it's free if you have your own EC2 account--and we welcome feature requests, pricing suggestions, and any other feedback. You can comment at our blog or email us from the site.

-Jonathan

> http://info.aws-console.com/
>
> Whats your view about them.. I been thinking about EC2/S3 for my project
> hosting but haven't yet reach the point of deployment yet. :) As you
> have done that and what you are saying here makes lot of sense to me.. I
> like to hear more about the above ..
>
> Never used it. Looks like a pretty front-end to the launch tools.

AWS-Console is becoming a lot more than just a pretty front-end. Just the overview and management it provides saves a lot of time. We automate some of the tedious tasks such as bundling up an image or performing backups to S3. We're also adding monitoring and application management. We're currently able to launch some of our Rails apps fully automatically from SVN to EC2 and we're working hard to offer this as a service to everyone.

Also, AWS-Console is a Rails app itself and it is deployed on EC2. I'm currently moving it from a single server deployment to a 4-server deployment. We're making heavy use of BackgrounDRb to poll EC2 for changes.
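
As a generic illustration of polling for changes (not AWS-Console's actual code), a background worker can keep the last snapshot of instance states and compare it with each new one. The instance ids and state names below are hypothetical, and fetching the snapshot from EC2 is left out.

```ruby
# Compares two snapshots of { instance_id => state } and returns only the
# entries that differ, as { instance_id => [old_state, new_state] }.
# A nil state means the instance was absent from that snapshot.
def state_changes(previous, current)
  changes = {}
  (previous.keys | current.keys).each do |id|
    before = previous[id]
    after = current[id]
    changes[id] = [before, after] unless before == after
  end
  changes
end

previous = { "i-1" => "pending", "i-2" => "running" }
current = { "i-1" => "running", "i-3" => "pending" }
p state_changes(previous, current)
```

Only acting on the differences keeps the web app from redrawing or re-notifying on every poll when nothing has happened.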

All in all EC2 is a fantastic platform.

Regards,
Thorsten

Peter,

Thanks for your latest input! I have so many more questions... I have not tried out aws-console yet...

We look forward to your input!

1. I am wondering about the database and persistence issue that was brought up in the thread -- does aws-console solve those issues?

If you are deploying on a non-replicated database server today--Yes. We have made EC2 at least as robust as this configuration--which seems to be reasonable for most web deployments. Things get more interesting as you add multiple tiers, large databases and more servers; however, this is an interesting problem regardless of whether you are on EC2 or not. We do feel that there is a lot of value on the table to continue in the direction we've taken and find ways of generalizing these deployment patterns and dialing them into automated EC2/S3 deployments.

With one checkbox, you can enable automated database backups to S3. These will run at whatever frequency you'd like depending on your configuration. EC2 and S3 have a fast connection and there is no charge for the transfers.
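
For readers rolling their own version, the scheduling side of a periodic backup can be sketched as below. This is not how AWS-Console does it; the key layout, the method names and the use of "depot" as a database name are assumptions, and the dump and the S3 upload themselves are not shown.

```ruby
# Builds a sortable, timestamped object key for a database dump.
# The "backups/<db>/<timestamp>.sql.gz" layout is an illustrative assumption.
def backup_key(db_name, at)
  "backups/#{db_name}/#{at.utc.strftime('%Y%m%dT%H%M%SZ')}.sql.gz"
end

# True when no backup has been taken yet or the interval has elapsed.
def backup_due?(last_backup_at, now, every_seconds)
  last_backup_at.nil? || (now - last_backup_at) >= every_seconds
end

now = Time.now
puts backup_key("depot", now)
puts backup_due?(now - 7200, now, 3600)
```

Putting the UTC timestamp in the key means a plain alphabetical listing is also a chronological one, which makes finding the newest backup during a restore straightforward.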

2. One of the other issues was: if my app is getting a lot of hits, can aws-console start a new EC2 instance dynamically, or shut instances down dynamically under less load? I.e. you deploy your app to a "Dynamic Instance" with your operating criteria (at load X do Y, etc.).

Without going into specifics, I can say that this is in our roadmap. I would encourage you to contact us directly to discuss if this is a scenario that you can leverage in your app.

3. You mention going from SVN to EC2: is it using deprec or Capistrano? And you are using BackgrounDRb, so does that mean I can run rake and cap tasks remotely, i.e. via the web?

Yes on Capistrano. Yes on web-initiated deploys. We will have screencasts coming for these features in mid-February. *It's been a lot of work!*

4. The AWS access key: would it be wise to have loads of AWS access keys stored in aws-console.com, which could then be a hack target? How did you solve this problem?

In the long term, we hope Amazon provides a better interface for 3rd party management of their APIs--in the meantime, we need to have access to our users' AWS credentials for a large cross-section of tasks (bundling comes immediately to mind). We store no private credentials in the db unencrypted, and are happy to answer specific questions about our security practices.
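
As a generic illustration of keeping a secret encrypted at rest (the thread does not say which scheme AWS-Console uses), here is a sketch with Ruby's standard OpenSSL bindings and AES-256-GCM. The method names and the sample secret are hypothetical, and where the encryption key itself lives is the hard part this example does not address.

```ruby
require "openssl"

# Encrypts a secret with AES-256-GCM and returns the pieces that must be
# stored together: the random IV, the authentication tag and the ciphertext.
def encrypt_secret(plaintext, key)
  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.encrypt
  cipher.key = key
  iv = cipher.random_iv # fresh IV for every encryption
  cipher.auth_data = ""
  data = cipher.update(plaintext) + cipher.final
  { iv: iv, tag: cipher.auth_tag, data: data }
end

# Reverses encrypt_secret; raises OpenSSL::Cipher::CipherError if the stored
# data or tag has been altered.
def decrypt_secret(box, key)
  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.decrypt
  cipher.key = key
  cipher.iv = box[:iv]
  cipher.auth_tag = box[:tag]
  cipher.auth_data = ""
  cipher.update(box[:data]) + cipher.final
end

key = OpenSSL::Random.random_bytes(32) # in practice, kept outside the database
box = encrypt_secret("example-secret-access-key", key)
puts decrypt_secret(box, key)
```

An authenticated mode such as GCM is used here so that a tampered database row fails loudly on decryption instead of yielding garbage credentials.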

If I understand you correctly, you are pursuing a route of "Virtual Hosting Provider" backed by Amazon/S3

Well to rephrase in our words... We know that commodity servers and storage resonates with our sense of how things will look into the future for web deployments. We'd like to understand the upcoming technologies and their associated pluses and minuses. By living in this space we hope to find pain points and provide solutions. Today this is EC2/S3, but we can imagine a tomorrow...

an alternative to the likes of Engine Yard, RailsMachine, etc., correct? Could you please tell me more about your differences?

I have not had enough experience with these services to try and differentiate our offer. That said, the Rails community tends to produce really great products--so I would expect solid offerings from these services.

We offer a free account on AWS-Console with a bit of free server time too. I'd encourage you to try EC2 out and see if it answers some of your questions--or share with us if you find more!

-Jonathan

As long as you are pointing and clicking and your website is the master deployment engine. I need to be able to automate calls to the system if your service is to be of any use to me. Otherwise I'd be better off concentrating on providing Capistrano tasks which achieve the same thing but which I can easily call from a program.

That's the problem with going down the website approach rather than the tool approach. Great for the pointy clickers.

Neil Wilson wrote:
> I need to be able to automate calls to the
> system if your service is to be of any use to me.

Great feedback! We've been told this by other users as well and have a beta REST interface that we can make available for you.

Peter & Neil--we really value your input here and are looking for direct feedback like you are providing. Would either of you be interested in being part of our bleeding edge user group? This means you would have access to our pre-release features (that will likely answer many of your questions and uncover many more) and you will also be invited to our discussion group where we debate our approach with other users.

-Jonathan

I'd love to be there - primarily because Google Groups appears to have failed to post two of my feedback messages here (and it gets lost in the noise).

I have a particular requirement which you may find of interest.

NeilW