Tuesday, November 26, 2013

Synnefo v0.14.10 Released

Hello everybody,

we are pleased to announce that today we released Synnefo v0.14.10.

You can find the Debian packages on our apt repository (apt.dev.grnet.gr) under wheezy.

You can also check out the upgrade notes here:
http://www.synnefo.org/docs/synnefo/latest/upgrade/upgrade-0.14.10.html

As you may already know from a previous email on the list:
https://groups.google.com/forum/#!topic/synnefo/rjUMyZTJmWU

Synnefo v0.14.10 is the second transitional package that will help you smoothly upgrade to Debian Wheezy. Synnefo v0.14.10 is no longer compatible with Ganeti 2.6, so you will also need Ganeti 2.8 to proceed. You can find the corresponding Ganeti package (snf-ganeti) on our apt repository too. The version you need is:

2.8.2+snapshot1+b64v1+hotplug3+ippoolfix+rapifix+netxen+lockfix2-1~wheezy

The hotplug3, ippoolfix, rapifix and netxen patches have already been merged upstream and will be part of Ganeti 2.10, so we just backported them to this package. The b64v1 and lockfix2 patches are two small patches that fix minor issues. The snapshot1 patch implements new snapshot functionality that is not currently used by Synnefo v0.14.10, so the code path is actually inactive, but it is there to get stress-tested in our production deployment.

So, go ahead, give them a try and please report back any problems or bugs you may find.

Enjoy,
on behalf of the Synnefo team,
Constantinos

Monday, October 7, 2013

Archipelago: officially open source

Hello everybody,

We are pleased to announce that today we are releasing our custom storage layer, Archipelago, as open source software under a 2-clause BSD license. Archipelago has been running in production for over half a year without problems, so after a substantial cleanup we decided to open it up to the public.

Archipelago is a new storage layer that decouples Volume and File operations/logic from the actual underlying storage technology used to store the data. It provides a unified way to provision, handle and present Volumes and Files independently of the storage backend. It also implements thin clones, snapshots, and deduplication, and has pluggable drivers for different backend storage technologies. It was primarily designed to solve problems that arise in large-scale cloud environments. Archipelago is written in C.

Please check out the official documentation:

You can try it out for yourself, following the instructions found here:

You can also find the code here:
https://code.grnet.gr/git/archipelago

Debian packages can be found on our apt repository under unstable. Add the following to your sources to use them:

deb http://apt.dev.grnet.gr unstable/
deb-src http://apt.dev.grnet.gr unstable/

Finally, on the apt repository, we provide an Archipelago ExtStorage provider, for those that want to use Archipelago with Ganeti.


We hope Archipelago proves as useful to others as it has been to us.

As always, for comments, questions, or bug reports feel free to contact us at:
synnefo@googlegroups.com

Enjoy,
the Synnefo team

Wednesday, October 2, 2013

Synnefo @ USENIX ;login: (Oct '13 issue)

Hello everybody,

The October issue of USENIX ;login: is out!

It features an article we've written about Synnefo. The good news is that it is open to everybody, not only USENIX members. So, go ahead and check it out. It is entitled:

"Synnefo: A Complete Cloud Stack over Ganeti"


Enjoy,
Constantinos

Thursday, September 12, 2013

Synnefo @ Ceph Day London

Hello everybody,

The schedule for Ceph Day London has just been announced!

What's more, we will be presenting Synnefo at the event, showing how we manage to unify cloud storage (Files, Images, VM disks) using

Synnefo + Ganeti + Archipelago + RADOS in production.

We are also really excited to meet the folks from Inktank and other Ceph users, to learn and exchange opinions.

So, if you are in London and the above sounds exciting, please join us at the event.

See you in London.

Tuesday, September 10, 2013

A great GanetiCon 2013 has ended

Hello everybody,

as you already know, the first Ganeti conference, GanetiCon 2013, took place last week in Athens. We were extremely happy to be part of it, not only in the organizational work, but most importantly during the conference itself, participating in the various design discussions and user reports.

It was great meeting most of the Ganeti team and other Ganeti users from Greece and abroad. It was a double pleasure to also meet happy users of Synnefo in person.

The conference went smoothly from beginning to end and everybody seems to have enjoyed the event. The venue, food and schedule were great; everyone involved tried their best and it really showed.

All in all it was a wonderful 3-day event.

For all of you that couldn't make it, please take a look at the updated GanetiCon site for photos and presentation slides.

Let's hope we made a good start for Ganeti this year, with more GanetiCons to come in the years ahead.

See you all at the next GanetiCon,
Constantinos


P.S.: a special thanks to the guys from Inktank and especially Patrick for sending us all the great Ceph swag, even though they couldn't make it to the event.

Tuesday, August 27, 2013

Switch from okeanos.io to demo.synnefo.org

Hello Synnefo users,

You may have noticed that lately we have been trying to separate the Synnefo software from the IaaS service it powers, ~okeanos. The latest step in this direction is the renaming of our demo deployment of Synnefo. The demo had been called okeanos.io for a while now, and it has changed to demo.synnefo.org. Also, to ease the transition, the okeanos.io URLs now point to the synnefo.org page.

What does this change mean to you? Well, mainly two things have changed:

1) The NAT gateway of your VMs has now changed from gate.okeanos.io to gate.demo.synnefo.org. This change affects the hostname you SSH to.
2) The authorization URL that grants API access has changed to https://accounts.demo.synnefo.org/identity/v2.0/. This is especially important for all users of kamaki, who will need to do the following:

1) Get the correct authentication URL and token from your API access page.
2) Create (or edit) an entry in your .kamakirc like the following:
[cloud "demo"]
url = https://accounts.demo.synnefo.org/identity/v2.0/
token = myT0k3n==
3) Check if you can authenticate with your credentials:
kamaki user authenticate --cloud demo
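
If you use kamaki as a Python library rather than as a CLI, the same check takes only a few lines of code. The sketch below is illustrative: the module path and constructor signature are assumed from kamaki's documented usage and may differ between kamaki versions.

# Illustrative sketch: authenticate against the new demo deployment
# using the kamaki client library (module path and signature assumed;
# check the version you have installed).
from kamaki.clients.astakos import AstakosClient

AUTH_URL = "https://accounts.demo.synnefo.org/identity/v2.0/"
TOKEN = "myT0k3n=="  # the token from your API access page

astakos = AstakosClient(AUTH_URL, TOKEN)
info = astakos.authenticate()  # raises an error on invalid credentials
print(info)
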
For any problems or questions, you can consult our support page.

Friday, August 23, 2013

Synnefo Live CD Released

Hello everybody,

We have some exciting news for all of you who want to try out our latest Synnefo software: our team has managed to bundle it into something as small as a USB stick. That's right, there is now a Debian-based Live CD edition of Synnefo which can be booted on your server or even your home computer!

The key features of this Live CD are:
  1. It just works!
  2. The ability to create and run Virtual Machines based either on Debian Wheezy (included), one of our pre-configured images, or one of your own.
  3. Create private virtual networks and connect your Virtual Machines.
  4. Test Archipelago, our new ultra-fast volume layer.
  5. Use Pithos, our storage service based on the OpenStack Object Storage API.
Moreover, we have stayed true to the plug-and-play nature of Live CDs, meaning that you can experience Synnefo straight from the Live CD, without installing it or making any configuration changes to your machine.

So, grab a copy (1.3 GB .iso image) from the Synnefo page (or directly from here), boot it and experience all the latest features of Synnefo!

For those unfamiliar with the Live CD process, you can follow the instructions below:

1. Burn the image either to a DVD or a USB medium. The image file is ready to be burned to a DVD, but for those who want to write it to a USB stick, there are several ways, depending on your Operating System:

For Linux, you can use dd:
dd if=/path/to/Synnefo-LiveCD.iso of=/dev/sd* bs=4M && sync
where /path/to/ is probably your Downloads folder and /dev/sd* your unmounted USB device (double-check the device name, since dd will overwrite whatever it points to).

For Windows, you can use one of the programs in this list.

2. Plug in your Live DVD or USB and change the boot sequence in your BIOS/UEFI to boot primarily from DVD or USB.
3. Once the Live DVD or USB has booted, choose the "Default" option and, in less than a minute, you will be greeted with a Firefox session that gives you a brief description of your live environment. From there on, you can freely experiment with Synnefo.

Caveats:

1. Only 64-bit CPUs with virtualization extensions are supported.
2. Since this is a Live CD, the VMs' disk storage also resides in RAM. This typically means that you can create only a small number of VMs. VMs created with Archipelago, however, are far less constrained by this.
3. The Debian image is Squeeze, so the latest Nvidia/AMD GPUs may not be supported.

Finally, note that the Live CD is still a work in progress. If you find any bugs or want to leave some feedback, you can contact us at synnefo{at}googlegroups{dot}com.

Enjoy,
the Synnefo team

Tuesday, August 20, 2013

Synnefo @ CloudOpen Europe 2013

Hello everybody,

the schedule for LinuxCon/CloudOpen Europe 2013 has just been announced and we are more than pleased to be a part of it. Vangelis from our team will be giving a talk/demo about Synnefo on Tuesday.

So, join us at CloudOpen 2013, to learn more about Synnefo and see how it is used in large scale production environments.

See you all in Edinburgh,
Constantinos

Monday, July 8, 2013

Synnefo v0.14 Released

Hello Synnefo users,

we are happy to announce that Synnefo v0.14 is out!

Synnefo v0.14 was not planned as a major-features release, but rather one addressing compatibility and uniformity issues in the way different components are handled and how they interact with each other. However, during this process it turned out that to achieve what we were targeting we had to refactor some parts of the code and abstract others, so in the end some important changes did happen.

In v0.14 we standardize the URL patterns across all Synnefo components, clearly distinguish components (e.g. Astakos, Cyclades, Pithos) from services (e.g. Keystone, Account, Compute, Image, Quota, Object Store), introduce a new, generic way to register components/services/resources, and update the APIs to be compatible with the latest corresponding OpenStack specifications.

Also, since v0.14, Synnefo is branding neutral. All '~okeanos' and 'GRNET' references have been removed and we have additionally introduced a new component: snf-branding, which enables Synnefo users to easily adapt a Synnefo installation to their company's/organization's visual identity.

Finally, we introduce a new tool called 'snf-deploy' which automatically deploys Synnefo. As a first step, we provide a new guide that explains how to use the tool to deploy the whole Synnefo stack on a single node in a few minutes.

Specifically, copy/pasting from the NEWS file:

Synnefo-wide
  • Standardize URLs for Synnefo Components
    • impose structure and naming conventions on all URL-related settings. Make each component deployable under a user-configurable <COMPONENT>_BASE_URL. Each API (compute, image, etc.) is deployable under a developer-configurable prefix beneath BASE_URL (see the sketch after this NEWS excerpt)
  • Branding customization support across Synnefo frontend components
    • ability to adapt the Astakos, Pithos and Cyclades Web UI to a company’s visual identity. This is possible using the snf-branding component, which is automatically installed on the nodes running the API servers for Astakos, Pithos and Cyclades
  • Create a JSON-exportable definition document for each Synnefo Component (Astakos, Cyclades, Pithos, etc.) that consolidates APIs (services), resources, and other standardized properties (e.g. default URL prefixes)
  • Implement common client for communication with Astakos and proper error handling
Astakos
  • Redesign of the accounting system (quotaholder) and integration into Astakos
  • Implementation of the Keystone API call POST /tokens
    • Specified the API call along with a procedure to register a Synnefo component (e.g. Cyclades) along with the services it provides (e.g. Compute, Image, Network) and the resources it handles (e.g. vcpu, ram)
  • Astakos specific API calls are moved under ‘/account/v1.0’
  • Support API calls for quotas, commissions and resources
  • Improve user activation process
  • Improve limit of pending applications by making it a quotable resource
  • Added fine-grained policies for user auth providers
  • Overhauling of Astakos management commands for usability and uniformity
Cyclades
  • Speed up private network creation, by creating a network on a Ganeti backend only when a server connects to that network
  • Rename management commands for commissions and resources for uniformity with other services
  • Synchronize Cyclades API with OpenStack Compute v2.0 API
Pithos 
  • Various minor fixes
Tools
  • Introduce the 'snf-deploy' tool, which automatically deploys Synnefo on a number of nodes
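
To make the URL conventions above concrete, here is a rough sketch of what the relevant settings could look like. The setting names follow the <COMPONENT>_BASE_URL pattern described in the NEWS excerpt and are illustrative, not necessarily the shipped defaults:

# Hypothetical Django settings illustrating the <COMPONENT>_BASE_URL
# convention; names and values are examples, not shipped defaults.
ASTAKOS_BASE_URL = "https://accounts.example.org/astakos/"
CYCLADES_BASE_URL = "https://compute.example.org/cyclades/"
PITHOS_BASE_URL = "https://storage.example.org/pithos/"
# Each API then lives under a developer-configurable prefix beneath
# its component's BASE_URL, e.g. <CYCLADES_BASE_URL>compute/v2.0/.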

Synnefo v0.14 doesn't support Ganeti 2.7 as originally planned, since Ganeti 2.7 had not been officially released by the time Synnefo v0.14 came out. We hope to support it officially in the next version of Synnefo.

As always, any kind of feedback is highly appreciated.

Enjoy,
the Synnefo team

Wednesday, May 22, 2013

First Google Ganeti Conference: GanetiCon 2013

[UPDATED with the new GanetiCon 2013 website links]

Hello everybody,

we are very pleased to announce that this year we are co-organizing the first Google Ganeti Conference: GanetiCon 2013.

The conference will take place on 3-5 September 2013 in Athens, Greece.

The first GanetiCon will be a developer-oriented conference, mostly featuring design talks and discussions about new features and future plans. It will probably also feature an advanced Ganeti workshop, depending on user demand. Additionally, anybody who is interested in:
  • learning how other companies/institutions use Ganeti
  • checking out what large-scale Ganeti deployments look like
  • glimpsing the product roadmap of Ganeti
  • contributing to future design of Ganeti
  • obtaining help with specific Ganeti issues
should definitely attend.

Most developers of the Ganeti and Synnefo team will be there, so it will be a great chance to meet you all in person, and answer all your questions.

If this sounds exciting and you are interested in attending the first GanetiCon please fill out the registration form or the CFP. Stay tuned.

See you all in Athens!

on behalf of the Synnefo team,
Constantinos

Monday, May 13, 2013

Synnefo team @ Xen Hackathon 2013

This year's Xen Hackathon will be kindly hosted by Google and will take place at Google's Dublin offices.

Six months ago, we thought: since Synnefo uses Ganeti at the backend and Ganeti already supports Xen as the underlying hypervisor, why not try to port Synnefo to use Xen too? What would that mean, and what would it take to make it happen?

It would mean that anybody running Ganeti with Xen would be able to just install Synnefo on top of their current deployment and have their own public or private cloud. Nice.

And how could we do that? Since Synnefo keeps a clear separation between the cluster management layer and the cloud layer, the changes should be minimal, affecting only a few components. Indeed, after some design discussions it turned out that only two Synnefo components needed porting:
  • snf-image: Ganeti OS Definition that handles image deployment and customization
  • snf-network: component that handles the host's networking after a NIC is up
snf-image has already been ported as of v0.8 and now supports Xen. snf-network is even easier to port, but to do so we need some additional support in Ganeti.

To implement such support in Ganeti, we introduced the corresponding topic at this year's Xen Hackathon. Dimitris and Stratos from the Synnefo team will be attending the hackathon to handle the job :)

So, if any of you are willing to help, want to meet the guys from the team, or would like to learn more about Synnefo, feel free to strike up a conversation at the event. They will be more than happy to chat with you. You can also reach them at:

Dimitris: dimara@grnet.gr
Stratos: psomas@grnet.gr

Tuesday, April 23, 2013

Synnefo Services and REST APIs

Today, we'll see an overview of the Synnefo Services and the RESTful APIs that enable Synnefo to talk to the outside world and vice versa.


Synnefo has three (3) basic components that provide all its Services. These components are:

  • Cyclades: Compute, Network, Image, Block Storage Service
  • Pithos: File/Object Storage Service
  • Astakos: Identity, Quotas Service

Synnefo exposes the OpenStack APIs for most of its operations. Extensions have been written for advanced operations wherever needed, along with minor changes for things that were missing or change frequently. Specifically:

  • Astakos exposes the OpenStack Identity (Keystone) API
  • Cyclades exposes the OpenStack Compute API
  • Pithos exposes the OpenStack Object Storage API

The following diagram shows the layered approach of Synnefo and the various APIs for each Service. The corresponding Synnefo component that implements each Service also appears in the diagram:


As shown above, the Synnefo Web UI is a standalone JavaScript client that speaks the Synnefo APIs. There is also an intuitive command-line client called kamaki that speaks the same APIs and can be used to access Synnefo. Furthermore, one can use the kamaki library for programmatic access.
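
As a taste of what talking to these APIs looks like, here is a hedged sketch of the Keystone-style authentication call using the python-requests library. The request and response shapes follow the OpenStack Identity v2.0 specification that the APIs track; the URL and token are placeholders:

# Illustrative only: POST /tokens against an Astakos endpoint,
# following the OpenStack Identity v2.0 request/response format.
import json
import requests

AUTH_URL = "https://accounts.example.org/identity/v2.0"  # placeholder
TOKEN = "myT0k3n=="                                      # placeholder

resp = requests.post(
    AUTH_URL + "/tokens",
    headers={"Content-Type": "application/json"},
    data=json.dumps({"auth": {"token": {"id": TOKEN}}}),
)
resp.raise_for_status()
access = resp.json()["access"]
# The service catalog lists the endpoints of each registered service.
for service in access.get("serviceCatalog", []):
    print(service["name"])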

To learn more about the APIs, please take a look at the detailed Developer's Guide. If you want to learn more about the Synnefo architecture, please take a look at the official documentation.

Thursday, April 11, 2013

Synnefo v0.13 Released

Hello Synnefo users,

we are happy to announce that Synnefo v0.13 is finally out!

During this release cycle, which took a long time indeed, we made some major changes in Synnefo. The biggest one was that we merged most of the Synnefo components into a single repository, allowing for a more uniform approach and aligned versioning.

The Synnefo repository now includes the following components:
snf-common, snf-webproject, snf-astakos-app, snf-pithos-app, snf-pithos-backend, snf-cyclades-app, snf-cyclades-gtools, snf-stats-app, snf-quotaholder, snf-tools.

The snf-pithos-webclient was left out, since it is going to be rewritten and merged into an all-new web UI coming in v0.15.

We also left out the components that are standalone and can be used independently of Synnefo. These are:
snf-image, snf-network, snf-vncauthproxy, nfdhcpd, snf-cloudcms.

Also during the merge, many components underwent heavy refactoring to allow for new features to come in this release and in the future.

Thus, a number of new features were introduced in v0.13. Copy/Pasting from the NEWS file:


Synnefo-wide

  • Support for pooling throughout Synnefo
    • Pooled Django DB connections, Pithos backend connections, and HTTP connections, using a single `objpool` package
  • Improved management commands
    • Unified codebase for output of tables in JSON, CSV
  • Bring most of Synnefo code inside a single, unified repository
    • support automatic Python and Debian package builds for individual commits
    • with automatic version generation
  • Overhauling of Synnefo settings: renames and refactoring, for increased uniformity (in progress)
  • Deployment: Standardize on gunicorn, with gevent-based workers and use of Green threads throughout Synnefo
  • Documentation: New scale-out guide, with distinct node roles, for mass Synnefo deployments


Astakos (Identity Management)

  • Support multiple authentication methods
    • Classic (username/password), Shibboleth, LDAP/Active Directory, Google, Twitter, LinkedIn
    • Users can enable/disable auth methods, and switch between them
  • Introduce a UUID as a global identifier for users, throughout Synnefo
    • The UUID remains constant as the user enables/disables login methods
  • Allow users to modify their email address freely
  • Per-user, per-resource accounting mechanism (quotaholder)
  • Full quota support, with per-user, per-resource quotas, based on quotaholder
  • Projects: Users can create and join Projects
    • Projects grant extra resources to their members
  • UI Enhancements for quotas and projects
    • distinct Usage tab, showing usage of individual resources
    • Project management UI
    • New Overview page


Cyclades (Compute)

  • Commission resources on quotaholder/Astakos
  • Support mass creation of flavors
  • Support for the ExtStorage disk template in Ganeti
  • Query and report quotas in the UI
  • Pass VM configuration parameters over a VM-side API (`vmapi`)
    • Do not pass sensitive data as Ganeti OS parameters
    • Keep sensitive data in memory caches (memcached) and never allow them to hit the disk
  • Display additional backend information in helpdesk machines list
  • Allow helpdesk users to search for an account using a known machine id
  • Helpdesk actions are now logged using Synnefo's common logging infrastructure


Pithos (Storage)

  • Support storage of blocks on a RADOS backend, for Archipelago
  • Rewritten support for public URLs, with admin-selectable length


Tools

  • Extend snf-burnin to include testing of Pithos functionality



For a preview of what's coming in the next major releases (we will be posting a detailed roadmap soon):

We are planning a stable v0.14 in which we will focus on code cleanup, documentation and stabilization, without introducing major features. We hope to have v0.14 work out-of-the-box with stock Ganeti 2.7 once it's officially out, so there will be no need for applying patches or custom Ganeti builds. Finally, for v0.15 we have some nice new features already on the way, which we will announce in a following post.

We are looking forward to your feedback,
the Synnefo team

Friday, April 5, 2013

objpool: Introducing generic pooling in Synnefo

This is to share our code and experiences with resource pooling
in developing and deploying the Synnefo cloud software.
Scroll down to the end for links.

Share nothing: simplicity and scale


One technical aspect of the Web is that HTTP requests are independent of
each other. They enter the server, hit the backend, then come back with a
response, and nowhere do they have to deal with other requests. This is the
HTTP and REST world, and this is what Web frameworks like Django strive for.

Why do we like this?

Because it makes it easy for machines to talk with each other, it simplifies
complex interactions among services, and allows them to scale better.

Do others like it?

Indeed they do. It seems that the standard practice is to not share at all.
Every request lives in its own private world. It has:
  • its own connection to the client
  • its own connection to other services
  • its own connection to the database
and if there's need for other types of backends, as we have in Synnefo, it has
its own instances of them too. Just keep them from bumping into each other,
at least long enough until it's someone else's problem, e.g., the OS's or the
DB's. (Also in this spirit is Django's inflexible READ COMMITTED transaction
isolation level, but that is for another post.)

Neat and easy.
When you first deploy your web project, you don't even need to worry about
these things; you are free to worry only about the hot stuff: functionality and
features! Yes, that was us, quite a few months back.


Cost of co-ordination

However, just conjuring a REST exterior doesn't mean we're there.
If requests have to use shared resources, then two problems occur:
  1. someone will have to program complex co-ordination so that they
    don't mess with each other's results.
  2. co-ordination means that requests will have to wait for each other.
That is, complexity rises, scalability falls.


Cost of constantly creating and destroying

Furthermore, other costs start to creep in as the thing scales up. In Synnefo,
a typical request will have to contact our Identity Management service
(Astakos) over SSL to check for user credentials and other info. It also needs
to connect to the database, which is handled by Django. Finally, requests that
need low-level access to our Storage Service (Pithos), e.g., to handle bootable
VM images, have to instantiate a storage backend, too.

With every request, creating and destroying each one of
  1. SSL connection
  2. DB connection
  3. Storage backend instance
is not free, even if it helps keeping requests isolated.

For SSL connections, it's not only the three-way TCP handshake that takes time,
but mainly the SSL negotiation that follows and includes much more data to be
exchanged.

Likewise, database connections are not just TCP connections. The database
server spawns a whole new process to serve each new connection, and shuts it
down afterwards. Moreover, a framework that hides SQL behind a convenient
object API (ORM) like Django does, may have to make additional queries for
discovering the schema of the database and ensuring consistency.

Our storage backend requires heavy initialization too, including a database
connection of its own.

So, how do we handle the above efficiently?

For the first problem, co-ordination, we simplify things by giving each request
its own resources, so that it doesn't have to co-ordinate with or wait for others.

But creating and destroying SSL connections, DB connections and storage
backends costs heavily.

If only we could share resources but pretend we don't...


Pooling


That's what pooling is about.
It creates the illusion that each request is alone in a private world, when
really it just leases shared resources from a pool.

For SSL connections, urllib3 is a full-featured library with a nice interface.
You make an HTTP request and it uses a pooled httplib connection. If you make
requests to the same server, no new connections will have to be established.
Unfortunately, urllib3 is not available for Debian Squeeze, and our pooling
concerns were not limited to HTTP. However, urllib3 will be in Debian Wheezy,
so maybe we'll consider it.

For DB connections, there's been some discussion within the Django community.
The essence is, Django does not consider pooling its concern, and there are
several independent solutions. The one we considered was pgBouncer, for
PostgreSQL. pgBouncer is an intermediate server which pretends to be the
database itself. In the front, it accepts connections from clients that come
and go. In the back, it keeps a pool of connections to the real database that
are kept alive, regardless of what happens frontside. Given that TCP
connections are much lighter to create and destroy than PostgreSQL server
processes, it makes quite a difference.

Of course, for the pooling of our own storage backend instances, we couldn't
find a ready-made solution.

So, we were in a situation where we would either introduce two new components
plus a custom pool for our storage backend instances, or write a simple,
abstract pooling mechanism. We could then use this generic mechanism in all
the above cases, simplifying things a lot, for triple the gain.

We decided to give it a shot, and go with the latter.

We created a generic object pool class, which we subclassed and customized to
create pools for httplib connections and our own storage backends, then used
them in our code. For the Django database, we gave Django a modified psycopg2
driver which transparently pools connections to the server.
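
In rough outline, the generic pool looks like the sketch below. The method and hook names are illustrative rather than the exact objpool API, and the real code handles more corner cases:

# Sketch of a generic, bounded object pool. A Semaphore caps how many
# objects are out at once; a Lock protects the free set. Subclasses
# provide the _pool_* hooks for their specific resource.
import threading

class ObjectPool(object):
    def __init__(self, size):
        self._semaphore = threading.Semaphore(size)  # limits total leases
        self._mutex = threading.Lock()               # guards the free set
        self._free = set()

    def pool_get(self):
        self._semaphore.acquire()  # block here if the pool is exhausted
        try:
            with self._mutex:
                obj = self._free.pop() if self._free else None
            if obj is None or not self._pool_verify(obj):
                obj = self._pool_create()  # replace missing or dead objects
        except:
            self._semaphore.release()  # never leak a slot on failure
            raise
        return obj

    def pool_put(self, obj):
        try:
            if obj is not None and not self._pool_cleanup(obj):
                with self._mutex:
                    self._free.add(obj)
            # if cleanup returned True, the object is too dirty: drop it
        finally:
            self._semaphore.release()

    # Hooks implemented by each concrete pool:
    def _pool_create(self):
        raise NotImplementedError   # build a fresh object
    def _pool_verify(self, obj):
        raise NotImplementedError   # True if the object is still usable
    def _pool_cleanup(self, obj):
        raise NotImplementedError   # reset state; True means discard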


Pooling in production

Here's a diagram of the pooling we have currently deployed:

As you can see there are quite a few places for pooling,
and it's nice we have them all covered with minimal effort.


Pooling headaches/lessons


But what about specific technical challenges in pooling,
what have we learned?


Concurrency control

For the sake of the above, we dive below the shared-nothing surface. We now
need to make sure that getting an object in and out of the pool is done
atomically and safely, even if multiple threads hit it at the same time. Since
we deploy Synnefo with gunicorn and gevent, we don't use real threads, but
"green" ones instead, so we need to make sure it works in that environment too.
It seems that a Semaphore and a Lock from threading are enough for this.


Leaks

There is a hazard, once you take an object out of the pool, that you may never
return it, because you forgot, or because you hit an unhandled exception.

There are two problems with this.

First, you may end up with lots of resources allocated but unusable, or if the
objects need explicit destruction handling, it may never get done because you
just lost them out of scope.

Second, if the pool has a limited size, as it might be used to prevent
overload, then with time, everything might leak out and leave nothing in the
pool, and everyone drawing from it just gets stuck forever. In fact, we
intentionally deployed with small pool sizes of this type, so that we would
really "feel" the leaks earlier and fix them.  It worked!

Concerning leaks, it's straightforward that the pool itself must handle
everything in the correct order of statements, with the right locking, and the
required try/except/finally blocks.

On the other side, there is a subtler issue for those who use the pool. The
object they just got out of the pool isn't just another object like all the
others they are used to: they must not ever lose it, even if somebody else
crashed on them! And if you have to circulate that pooled object across
software layers that might not even be aware of the pooling... well, it's
error-prone.

To solve this problem, we opted to wrap the pooled object in a place where the
pooling code has enough control, like a decorator or context manager, or even
carefully encapsulated within another type of object. That is what urllib3
does; it doesn't expose the pooled httplib connections, but provides an
additional API layer to work with, while taking care of all the housekeeping
internally. Thinking it would be simpler, we didn't do this at first, and it
hurt us. Now we do :)
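
In code, such a wrapper can be as small as a context manager, so that a lease can never outlive its scope (the names continue the illustrative sketch above):

# A lease that always returns to the pool, even on exceptions.
from contextlib import contextmanager

@contextmanager
def pooled(pool):
    obj = pool.pool_get()
    try:
        yield obj
    finally:
        pool.pool_put(obj)

# Usage: the pooled object never escapes the with-block.
# with pooled(http_pool) as conn:
#     conn.request("GET", "/status")
#     response = conn.getresponse()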


Accidental Sharing

Another hazard is that the same object might be used concurrently in two or
more places. This may happen because of a pool concurrency failure, which was
covered previously, or because you put it into the pool twice, or because you
never stopped using it even after you put it back into the pool and made it
available to others.

The pool itself might protect you from putting an object in twice, but it
can't force you to stop using it after giving it back. If the real pooled
object is not given out directly, but is privately attached to another
front-facing object, you can try to "disable" the front-facing object by
disconnecting the real one behind it, so that it becomes useless. Or, see the
previous note and avoid exposing pooled objects altogether.


Dirty state

We usually don't care about the state of an object, say a connection or a
file, if we are about to discard it. Everything is wiped out and no state can
cause us worry. But sharing changes this. If I get a "used" connection and the
previous user hasn't read all the data waiting for them, then my first read
will return their data, not mine. Not good.

Therefore we included a "cleanup" step when returning objects to the pool.
This step is specific to each type of pool, as what counts as "clean" differs.
For example, if an HTTP connection is not idle, it is discarded. A database
connection will clean everything up by aborting all transactions and releasing
all resources.


Dead objects

An object may sit in the pool for a long time. When it finally gets out, it
might be dead. A connection might have dropped, or a backend handle might have
been invalidated. Therefore we include a validation step when getting an
object from the pool. Objects that fail validation are discarded and new ones
are drawn.

For example, if a connection is newly drawn from the pool but there's data
waiting to be read from it, then it can't be good. Most likely it's an EOF,
but even if it isn't we just throw it away.
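
As a concrete illustration of the validation and cleanup hooks together, here is how an HTTP connection pool might detect such dead or dirty connections, continuing the illustrative sketch above (httplib is the Python 2 module of that era):

# Hypothetical HTTP connection pool: a connection whose socket is
# readable before we even send a request has an EOF or stray data
# waiting, so it fails verification and is thrown away.
import httplib
import select

class HTTPConnectionPool(ObjectPool):
    def __init__(self, netloc, size):
        ObjectPool.__init__(self, size)
        self.netloc = netloc

    def _pool_create(self):
        return httplib.HTTPConnection(self.netloc)

    def _pool_verify(self, conn):
        sock = conn.sock
        if sock is None:
            return True  # not yet connected; it will connect on first use
        readable, _, _ = select.select([sock], [], [], 0)
        return not readable  # pending data means EOF or stale bytes

    def _pool_cleanup(self, conn):
        # A non-idle connection is cheaper to discard than to drain.
        return not self._pool_verify(conn)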

Interested? Check it out!


We have released the generic pooling class as open source. It may prove as
useful to others as it has been to us. Our HTTP pool is general enough that we
have included it in the package, too.
Generic pooling + HTTP subclass source:
deb http://apt2.dev.grnet.gr stable/
deb-src http://apt2.dev.grnet.gr stable/

The code is already in action in production and in our free trial service: https://okeanos.io/

Friday, March 15, 2013

Synnefo plugin for Thunderbird



Starting with version 13, Thunderbird added support for online storage services through Filelink. It allows you to upload attachments to an online storage service and then replaces the attachment in the message with a link. Filelink can be configured to use many of the well-known storage services out there, and it can now be configured to support Synnefo deployments, too.

We created the ~okeanos filelink plugin for Thunderbird, to support our ~okeanos public cloud service. It uses the Astakos API for authenticating the user and the Pithos API for uploading and publishing the attachments. It is easy to use as it requires only the authentication token from the user. Also, it can be configured to support any Synnefo installation by specifying the correct Astakos and Pithos endpoint URLs.

Feel free to clone its code and create your own Thunderbird plugin by only changing the endpoint URLs. You can then use it with your Synnefo powered cloud service.



Monday, February 11, 2013

Synnefo + RADOS = <3

We are happy to announce that Synnefo now supports completely unified storage
(Files/Images/VM disks) over commodity hardware using RADOS. Having passed the testing phase, it is now heading to our production environment (~okeanos). And it scales!

But what does "completely unified storage" mean and why RADOS?

Let's take things from the beginning.

Problem #1


Trying to scale a public cloud service over commodity hardware is not a trivial task. At first (mid-2011), we had all our VMs running over DRBD with Ganeti, and our File/Object Storage service (Pithos) backed by a big NAS. DRBD is great, production-quality software, enabling live migrations with no shared storage and aggregate bandwidth that scales with the number of compute nodes. We wanted to keep all that. On the other hand, we knew that if we wanted the Storage service to scale, we would have to get rid of the NAS eventually. We were also eager to explore new paths toward having the same backing storage for VM disks and files.

An obvious choice would be to use a distributed filesystem running over commodity hardware, such as Lustre, GPFS or GlusterFS. Ganeti already supported VM disks over shared files, so the only overhead would be to write a shared-file driver for our Storage service, which was trivial. However, we quickly realized that we didn't really need filesystem semantics to store VM volumes, and we could certainly avoid the burden of maintaining a distributed filesystem. Object/block semantics were what we were looking for.

So, we decided to test RADOS, since we had been following the progress of Ceph since mid-2010. For the VM disks part, we implemented RBD support inside Ganeti and merged it into the official upstream. Starting with version 2.6, Ganeti supports instance disks on RADOS out of the box, using the RBD kernel driver and the RBD tools. Furthermore, we implemented a RADOS backend driver for our Storage service. We chose not to go with RadosGW, since we already had in production a storage frontend that implemented the OpenStack Object Storage API and also allowed for more advanced features, such as deduplication via content hashing and file sharing among users.

By late 2011, we had it all up and running for testing. The architecture looked like this:

The above made two things possible: a) it enabled selection of the storage type for VMs, either RBD or DRBD, as an option for Synnefo end users, and b) it enabled Synnefo administrators to choose between a RADOS cluster or NFS / NAS as a backend for the Storage service. With this approach, we continued to do live migrations of VMs with no physically shared storage, this time over RBD to any physical node and not just DRBD's secondary node. And we experimented with having RBD-based VM disks in the same RADOS storage cluster as files. So far, so good.


Problem #2


This seemed like quite a success at the time, but it still didn't allow us to do magic. And by magic we mean the type of Image handling we were envisioning. We wanted to achieve three things at the same time:

  • From the perspective of the Storage service, Images being treated as common files, with full support for remote syncing and sharing among users.
  • From the perspective of the Compute service, Images cloned and snapshotted repeatedly, with zero data copying from service to service.
  • And finally, snapshots appearing as new Image files, again with zero data movement.
So, what could we do? We liked the architecture so far, with Synnefo, Ganeti and RADOS. RADOS seemed a good choice for consolidating storage of VM disks and files in a single place. We decided to design and implement a thin, distributed, custom storage layer, completely independent of Synnefo, Ganeti and RADOS, that could be plugged in the middle and do the job. If this worked, we would get to keep the goodies of all three base technologies and work with well-defined abstractions among them. And that's what we did.

By mid 2012 we had the prototype ready. We called it Archipelago. Then, we needed to integrate it with the rest of the infrastructure and start stress-testing. The integration happened in two directions: with Ganeti, on one side, and with RADOS, on the other.

To integrate with Ganeti, we stopped using Ganeti's native RBD support to talk to RADOS, since we now had Archipelago in between. We exploited Ganeti's ExtStorage Interface and wrote the corresponding Archipelago driver (ExtStorage provider) for Ganeti.

To integrate with RADOS, we implemented a RADOS driver for Archipelago. Finally, by late 2012 we had the code completed and have been testing it ever since. The architecture now looks like this:

After 3 months of stress testing, we are now in the process of moving everything into the ~okeanos production service which, at the time of this writing, is running more than 2700 active VMs for more than 1900 users.

For the impatient, the new feature is already up and running on our free trial demo infrastructure at:

http://www.okeanos.io

So, create a free trial account and choose the "Archipelago" disk option when creating your new VM. You will see it coming up in seconds, thinly!
Backed by RADOS :)

Enjoy!

Wednesday, January 23, 2013

Synnefo at FOSDEM 2013

FOSDEM 2013 is getting really close and we have two reasons to be happy
about it: first, because we are going to spend a very interesting and fun
weekend just by attending, and second, because we will be introducing
Synnefo to the rest of the community!

Vangelis will be giving a talk in the cloud devroom, introducing Synnefo
to the masses, explaining its design and architecture, recounting the last
two years of its development, and describing how we used it to provide a
large-scale, production-quality cloud service. He will also point out how
Synnefo differs from other well-known cloud platforms.

So, if you are also attending FOSDEM and would like to know more about
Synnefo, feel free to join us at the talk. We will be glad to meet you.

See you at FOSDEM.

Friday, January 18, 2013

Synnefo is here!

Hello everybody,

this is the first post on the official Synnefo blog, so let me make a
brief introduction.

On this blog we will be posting all kinds of things concerning Synnefo:
technical material, practices for running a large-scale production
service, user stories and much more.

For those who are not familiar with Synnefo, please visit the official
page:

http://www.synnefo.org

Here is a short description and a bit of history:

Synnefo is production-quality open source cloud software. It came out of
GRNET's need to provide a full-fledged, Amazon-like cloud service that
would be very simple to use, even for completely inexperienced users.
The software is designed for large-scale installations and targets
commodity hardware.

A small group of engineers began its development in late 2010.
It has been powering GRNET's public cloud service since late 2011:

http://www.okeanos.io

At the Synnefo site, we have two mailing lists:

 * For users: synnefo@googlegroups.com
 * For developers: synnefo-devel@googlegroups.com

"Synnefo" is the Greek word for "Cloud".
Please feel free to use it and love it.