Emacs and nosetests

Sometimes you just need a long transatlantic flight and a stupidly long stop-over in a random city to do some of those tasks that could improve your day-to-day work but that you never take the time to do.

When using Emacs I wanted a simple way to launch nosetests on the function my cursor is currently in. The nosetests syntax is a bit tricky and I always have to look at my shell history to remember it (nosetests directory/filename.py:Class.function).

I created a simple Emacs wrapper for that which lets you hit a key to copy the nosetests command to feed to your shell, or to run it in the compile buffer.

It's available here:

https://github.com/chmouel/emacs-config/blob/master/modes/nosetests.el

I have bound these keys in my python-mode hook:

(add-hook 'python-mode-hook
          (lambda ()
            (local-set-key (kbd "C-S-t") 'nosetests-copy-shell-comand)
            (local-set-key (kbd "C-S-r") 'nosetests-compile)))

Happy TDD!!!!

UPDATE: There was already another nose mode that does much more, available here: https://bitbucket.org/durin42/nosemacs/

Using python-novaclient against Rackspace Cloud next generation (powered by OpenStack)

With the modular auth plugin system merged into python-novaclient it is now very easy to use the nova CLI against the Rackspace Public Cloud powered by OpenStack.

We even have a metapackage that installs all the needed bits. It should be as easy as doing this:

pip install rackspace-novaclient

and all dependencies and extensions will be installed. To actually use the CLI you just need to specify the right arguments (or set them via environment variables, see nova --help) like this:

nova --os_auth_system rackspace --os_username $USER --os_tenant_name $USER --os_password $KEY

On the Rackspace cloud the username is usually the tenant name, so these should match.

For the UK cloud you just need to change the auth system to rackspace_uk, like this:

nova --os_auth_system rackspace_uk --os_username $USER --os_tenant_name $USER --os_password $KEY

The swift.common.client library and swift CLI have moved to their own project

Historically, if you wanted to write Python software against OpenStack Swift, you would use either the python-cloudfiles library or the swift.common.client module shipped with Swift.

python-cloudfiles was made mostly for Rackspace Cloud Files before Swift even existed and does a lot of extra stuff not needed for OpenStack Swift (e.g. CDN support).

swift.common.client was designed for OpenStack Swift from the ground up, but since it was included with Swift, people had to download the full Swift repository if they wanted to use it or test against it (e.g. OpenStack Glance).

As of yesterday we have removed swift.common.client along with the bin/swift CLI and moved them to their own repository, available here:

https://github.com/openstack/python-swiftclient

This should be compatible with swift.common.client, the only difference being that you import swiftclient instead of swift.common.client.
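
For example, here is a minimal sketch of what client code looks like after the switch (the auth endpoint and credentials are placeholders; the Connection API itself is the same one swift.common.client exposed):

import swiftclient

# was: from swift.common.client import Connection
conn = swiftclient.client.Connection(
    authurl='http://127.0.0.1:8080/auth/v1.0',  # placeholder auth endpoint
    user='account:user',                        # placeholder credentials
    key='secretkey')

# same calls as before, e.g. list the containers of the account
headers, containers = conn.get_account()
for container in containers:
    print(container['name'])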

At this time we are using the same Launchpad project as Swift, so feel free to file bugs and feature requests under the swift project in Launchpad:

https://bugs.launchpad.net/swift/+filebug

and add the tag python-swiftclient there.

S3 emulation for OpenStack Swift has moved

A little note about swift3, the S3 emulation layer for OpenStack Swift.

As of this review we have removed it from Swift, since the decision[1] was made that only the official OpenStack API would be supported in Swift. Development will continue in fujita's repository on GitHub at this URL:

https://github.com/fujita/swift3

Feel free to grab the middleware or report issues in fujita's repository.

[1] Globally for OpenStack, not just for Swift.

Swift integration with other OpenStack components in Essex.

During the OpenStack Essex development cycle a lot of work was done to make Swift work well with the other OpenStack components. Here is a list of that work.

MIDDLEWARE

To make Swift behave well in the 'stack' we had to get a rock-solid Keystone middleware and make sure most of the features provided by Swift would be supported by it.

The middleware currently lives in the Keystone Essex repository and was entirely rewritten since the Diablo release to support these Swift features:

  • ACLs via Keystone roles:

This lets you map Keystone roles to ACLs. For example, to allow users with the Keystone role 'Reader' to read a container, a user with a swift operator role can set this ACL:

-r:Reader container

  • Anonymous access via referrer ACLs:

If a swift operator wants to give anonymous read access to a container, they can set this ACL:

-r:*

It basically means you are enabling public read access to the container (see the sketch after this feature list for how these ACLs map to the X-Container-Read header).

  • Container syncing :

This allows you to keep two different containers in sync; see the documentation here.

  • Different reseller prefixes:

You can mix different auth servers on your Swift cluster, like swauth and Keystone.

  • Special reseller admin account :

This is a special account that is allowed to access all accounts. It is used by Nova, for example, to upload images into different accounts.

  • S3 emulation :

Allows you to talk to Swift with the S3 API, using swift3 and the new s3_token middleware. s3_token simply takes an S3 token, validates it against Keystone, and passes the proper tenant/user information on to Swift.
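
As referenced in the ACL items above, here is a minimal sketch of how those two ACLs translate to the X-Container-Read header when set programmatically with python-swiftclient (or the identical swift.common.client API); the connection parameters are placeholders and '.r:*' is the referrer syntax the header expects:

import swiftclient

# placeholder auth endpoint and credentials
conn = swiftclient.client.Connection(
    authurl='http://127.0.0.1:8080/auth/v1.0',
    user='account:user',
    key='secretkey')

# map the Keystone role 'Reader' to a read ACL on the container
conn.post_container('container', {'X-Container-Read': 'Reader'})

# or give anonymous (referrer-based) read access to the container
conn.post_container('container', {'X-Container-Read': '.r:*'})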

One thing missing from the middleware is auth overriding: when another middleware wants to take care of the authentication for some requests, the auth middleware just steps aside and lets the request continue. This feature is used, for example, by the temp_url middleware to allow temporary access to or upload of an object. It is planned to be supported in the future.

An important thing to keep in mind when you configure your roles is to have a user in a tenant (or account, as it is called in the Swift world) acting as an operator. This is controlled by the setting:

swift_operator_roles

which defaults to the roles swiftoperator and admin. A user needs to have one of these roles to be able to do anything in a tenant.

GLANCE

Glance has been updated as well so it can store images in a Swift cluster whose auth server uses the 2.0 Identity auth.

NOVA

Nova has the ability to use an objectstore to hold images uploaded with the euca-upload-bundle command. Historically Nova shipped with a service called nova-objectstore, but that service was buggy and had some security issues. Swift combined with Keystone's s3_token and the swift3 middleware can now act as a more reliable and secure objectstore for Nova.

DEVSTACK

DevStack supports Swift if you add the swift service to the ENABLED_SERVICES variable in your localrc. This is where you want to poke around to see how the configuration is put together to have everything playing well. The only bit that didn't make it for the DevStack Essex release is having Glance store images directly in Swift.

CLI / Client Library

The Swift CLI and client library (called swift.common.client) have been updated to support auth 2.0. The CLI now supports the common OpenStack CLI arguments and environment variables to operate against an auth server that speaks the 2.0 Identity auth.

We unfortunately did not have time to add support for OS_AUTH_TENANT, so the Swift auth v1 syntax is used instead: if the user has the form tenant:user, the part before the colon is taken as the tenant and the part after it as the user.
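
In other words, the convention is roughly this (a trivial sketch, not the actual CLI code):

# 'demo:demo2' means tenant 'demo', user 'demo2'
user = 'demo:demo2'
if ':' in user:
    tenant_name, username = user.split(':', 1)
else:
    tenant_name, username = None, user
print(tenant_name, username)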

Aside from a couple of missing bits, we believe Swift should be rock solid to use with your other OpenStack components. There is no excuse not to use Swift as your central object storage component in OpenStack ;-).

 


What does a PUT to a Swift object server look like?

I have been trying lately to get a better understanding of the Swift code base, and I found the best way to learn it was to read it from top to bottom and document it along the way. Here are some of my notes; hopefully more will come.

I am starting with an object PUT where the request comes from the proxy server. The request in the log file will look like this:

"PUT /sdb1/2/AUTH_dcbeb7f1271d4374b951954a4f1be15f/foo/file.txt" 201 - "-" "txdw08eca2842e344bb8e11b5869c81cb52" "-" 0.0308

The WSGI controller sends the request to the swift.obj.server.ObjectController.PUT method, which starts by doing the following:

  • Splits the request.path into:

device(sdb1), partition(2), account(AUTH_ACCOUNT_ID), container(foo), obj(file.txt)

  • Makes sure the device is mounted (there is a mount_check option that can toggle this).
  • Ensures that there is an X-Timestamp header, which should be set by the proxy server.
  • Runs the check method check_object_creation, which does the following:
  • Makes sure the content length is not greater than MAX_FILE_SIZE.
  • Makes sure there is a Content-Length header (unless the transfer is chunked).
  • Makes sure there is no content length (i.e. a zero-byte body) when doing an X-Copy-From.
  • Makes sure the object name is not longer than MAX_OBJECT_NAME_LENGTH (1024 bytes by default).
  • Makes sure we have a Content-Type in the headers passed (this could be set by the user or auto-guessed via mimetypes.guess_type on the proxy server).
  • When there is an x-object-manifest header (for large file support), makes sure the value is in container/object form and does not contain characters like ? or & in the referenced object names.
  • Checks the metadata: first makes sure metadata names are not empty.
  • Metadata names are not longer than MAX_META_NAME_LENGTH (default: 128).
  • Metadata values are not longer than MAX_META_VALUE_LENGTH (default: 256).
  • There are not more metadata items than MAX_META_COUNT (default: 90).
  • The combined size of the metadata (names + values) is not over MAX_META_OVERALL_SIZE (default: 4096).
  • If we have X-Delete-At (for the object expiration feature), makes sure it is not in the past, otherwise it exits with an HTTPBadRequest.
  • The class swift.obj.server.DiskFile is the one that takes care of actually writing the file locally. It gets instantiated and does the following in its constructor:
  • It hashes the (account, container, obj) value, which for our example becomes (see the sketch further down):

46acec4563797178df9ec79b28146fe1

  • It computes the path where this is going to be stored, which is going to be:

/srv/node/sdb1/objects/2/fe1/46acec4563797178df9ec79b28146fe1

  • /srv/node is the devices path, set by the devices configuration directive (default: /srv/node).
  • sdb1 is the mounted device name.
  • objects is the datadir type ("objects" for us).
  • 2 is the partition.
  • fe1 is the last three characters of the hashed name.
  • and finally the hash itself, 46acec4563797178df9ec79b28146fe1.
  • It computes the temporary directory, which in our case is /srv/node/sdb1/tmp: basically the devices dir, the device, and tmp.
  • If the directory didn't exist before, it just returns.
  • If the directory already existed (the object was already uploaded), it walks the files in there and looks for:
  • Files ending in .ts, which are tombstones (deleted files). NB: the replication process will take care of os.unlink()-ing the file properly later.
  • In the case of a POST with the fast-post setting enabled (see the object_post_as_copy option in the proxy server config), it detects that and only copies the metadata.
  • It calculates the upload expiration time, which is now + the max_upload_time setting.
  • It starts the ETag hashing to gradually calculate the MD5 of the object.
  • Using the mkstemp method of DiskFile it starts writing to the tmp dir; the creation of the file goes like this:
  • It makes sure the tmp dir is created.
  • It makes a secure temporary file (using mkstemp(3)) and yields the file descriptor back to PUT.
  • If there is a Content-Length in the headers (assigned by the client), it uses the POSIX function fallocate(2) to pre-allocate that disk space for the file descriptor.
  • It then iterates over chunks whose size is defined by the network_chunk_size configuration variable (default: 65536 bytes), reading each chunk from the request's wsgi.input:
  • It updates the upload_size value.
  • It makes sure we are not going over the upload expiration time (otherwise the client gets back an HTTPRequestTimeout error).
  • It updates the calculated MD5 with that chunk.
  • It writes the chunk using Python's os.write().
  • For large files, once the amount written goes over the bytes_per_sync configuration variable it does an fdatasync(2) and drops the kernel buffer cache (so we are not filling up the kernel memory too much).
  • If there is a Content-Length in the client headers that doesn't match the calculated upload_size, it returns a 499 Client Disconnected, as it means there was a problem somewhere during the upload.
  • It bails out if there is an ETag in the client headers that doesn't match the calculated ETag.
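
As referenced above, here is a rough sketch of how that hash and on-disk path are derived (simplified from swift.common.utils.hash_path; the suffix is the per-cluster swift_hash_path_suffix from swift.conf, shown here as a placeholder, so the output only matches the example path when run with the real cluster value):

from hashlib import md5
import os

HASH_PATH_SUFFIX = 'changeme'  # placeholder: [swift-hash] swift_hash_path_suffix

def object_path(devices, device, partition, account, container, obj):
    # md5 of the full object name, salted with the cluster-wide suffix
    name = '/%s/%s/%s' % (account, container, obj)
    hashed = md5((name + HASH_PATH_SUFFIX).encode('utf-8')).hexdigest()
    # <devices>/<device>/objects/<partition>/<last three chars>/<hash>
    return os.path.join(devices, device, 'objects', str(partition),
                        hashed[-3:], hashed)

print(object_path('/srv/node', 'sdb1', 2,
                  'AUTH_dcbeb7f1271d4374b951954a4f1be15f', 'foo', 'file.txt'))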

And now we start defining the metadata that we are going to store with the file:

metadata = {
  'X-Timestamp': the timestamp generated by the proxy server,
  'Content-Type': defined by the user or 'guessed' by the proxy server,
  'ETag': the value calculated from the request,
  'Content-Length': an fstat(2) on the file descriptor to get the actual size stored on disk,
}

  • It adds to the metadata every header starting with 'x-object-meta-'.
  • It adds to the metadata the other headers allowed to be stored, which are defined in the allowed_headers config variable (default: allowed_headers = Content-Disposition, Content-Encoding, X-Delete-At, X-Object-Manifest).
  • It writes the file using the put method of the DiskFile class, which finalizes the writing of the file on disk and renames it from the temp file to the real location:
  • It writes the metadata using extended attributes (xattr), so it is stored directly with the file (see the sketch at the end of this walkthrough).
  • If there is a Content-Length in the metadata, it drops the kernel buffer cache for that byte range of the file.
  • It invalidates the hashes of the datadir using the swift.obj.replicator.invalidate_hash function:
  • This sets the hash of the suffix dir to None, which hints to the replication process that it has something to do with that dir (and the hash will be regenerated).
  • These hashes are stored per partition as a Python pickle, in our case: /srv/node/sdb1/objects/2/hashes.pkl
  • It moves the file from the tmp dir to the datadir.
  • It uses the unlinkold method of DiskFile to remove any older versions of the object file, i.e. any files with an older timestamp.
  • It then constructs the request to send to the container server, passing the following:
  • account, container, obj as the request path.
  • the original headers.
  • the headers Content-Length, Content-Type, X-Timestamp, ETag and X-Trans-Id.
  • It gets the X-Container-{Host,Partition,Device} headers from the original headers, which are set by the proxy so the object server knows which container server to update. The proxy assigns a different container replica to each object server involved in the PUT.
  • It uses the async_update method (on self, since it is part of the same class) to make an asynchronous request:
  • passing the aforementioned built headers and req.path.
  • If the request succeeds (status between 200 and 300), it returns to the main PUT method.
  • If the request did not succeed, it creates an async_pending file locally on the device, which will be picked up by the object updater process to update the container listing when the container is not too busy.
  • When finished, it responds with an HTTPCreated.
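
To see the result of those last steps on disk, here is a small sketch that reads back what was just written, reusing the example paths from above and the same xattr Python module Swift itself uses (very large metadata can spill over into user.swift.metadata1, 2, ... keys, which this sketch ignores):

import glob
import pickle
import xattr

# locate the .data file for our example object (the file name is the X-Timestamp)
datadir = '/srv/node/sdb1/objects/2/fe1/46acec4563797178df9ec79b28146fe1'
datafile = glob.glob(datadir + '/*.data')[0]

# the pickled metadata dict written as an extended attribute by DiskFile.put
print(pickle.loads(xattr.getxattr(datafile, 'user.swift.metadata')))

# the per-partition hashes invalidated by the PUT: the 'fe1' suffix now maps to None
with open('/srv/node/sdb1/objects/2/hashes.pkl', 'rb') as f:
    print(pickle.load(f))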

Audit a swift cluster

Swift integrity tools.

There are quite a few tools shipped with Swift to ensure you have the right objects on your cluster.

First there is the basic:

swift-object-info

It takes a Swift object stored on the filesystem and prints some information about it, like this:

swift@storage01:0/016/0b221bab535ac1b8f0d91e394f225016$ swift-object-info 1327991417.01411.data
Path: /AUTH_root/foobar/file.txt
Account: AUTH_root
Container: foobar
Object: file.txt
Object hash: 0b221bab535ac1b8f0d91e394f225016
Ring locations:
192.168.254.12:6000 - /srv/node/sdb1/objects/0/016/0b221bab535ac1b8f0d91e394f225016/1327991417.01411.data
Content-Type: text/plain
Timestamp: 2012-01-31 06:30:17.014110 (1327991417.01411)
ETag: 053a0f8516a5023b9af76c49ca917d3e (valid)
Content-Length: 24 (valid)
User Metadata: {'X-Object-Meta-Mtime': '1327968327.21'}

PS: If you don't know on which node your object is located, you can use swift-get-nodes.

For auditing, the ETag value is important because swift-object-info compares the ETag recorded in the object's metadata with what is actually on disk. Let's see if that works:

swift@storage01:0/016/0b221bab535ac1b8f0d91e394f225016$ cp 1327991417.01411.data /tmp
swift@storage01:0/016/0b221bab535ac1b8f0d91e394f225016$ echo "foo" >> 1327991417.01411.data
swift@storage01:0/016/0b221bab535ac1b8f0d91e394f225016$ swift-object-info 1327991417.01411.data|grep '^Etag'
Etag: 053a0f8516a5023b9af76c49ca917d3e doesn't match file hash of 9ff871e5ce5dcb5d3f2680a80a88ff38!

swift-object-info has detected that this file is not the one we have uploaded.

There is another tool called swift-drive-audit which, as explained in the admin guide, parses /var/log/kern.log with predefined regexps to detect disk failures reported by the kernel. It is usually run periodically by cron and has a config file called /etc/swift/drive-audit.conf. If the script finds any errors for a certain drive it will unmount it and comment it out in /etc/fstab(5). Afterwards the replication process will pick the missing data up from the other replicas and put those objects on handoff nodes.

Swift also provides different types of auditor daemons for accounts, containers and objects:

  •  swift-account-auditor
  •  swift-container-auditor
  •  swift-object-auditor

swift-account-auditor opens all the SQLite DBs on an account server and runs a SQL query to make sure the DBs are valid.
swift-container-auditor does the same but for containers.
swift-object-auditor opens every object on an object server and makes sure that (see the sketch after this list):

  • Metadata are correct.
  • We have the proper size.
  • We have the proper MD5.
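
As referenced above, here is a simplified sketch of the kind of check the object auditor performs on a single .data file (the real code lives in swift.obj.auditor; as before, metadata spilling over several xattr keys is ignored):

from hashlib import md5
import pickle
import xattr

def quick_audit(datafile):
    # read the pickled metadata stored as an extended attribute with the file
    meta = pickle.loads(xattr.getxattr(datafile, 'user.swift.metadata'))
    etag = md5()
    size = 0
    with open(datafile, 'rb') as f:
        for chunk in iter(lambda: f.read(65536), b''):
            etag.update(chunk)
            size += len(chunk)
    # compare what is on disk with what the metadata claims
    assert size == int(meta['Content-Length']), 'size mismatch'
    assert etag.hexdigest() == meta['ETag'], 'MD5 mismatch'

# run from the object's directory, as in the swift-object-info example above
quick_audit('1327991417.01411.data')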

These auditors need to be configured in each corresponding server config; for example, for the account server you would add something like this to /etc/swift/account-server.conf:

[account-auditor]
# You can override the default log routing for this app here (don't use set!):
# log_name = account-auditor
# log_facility = LOG_LOCAL0
# log_level = INFO
# Will audit, at most, 1 account per device per interval
interval = 1800

For the container auditor the options are about the same; for the object server these are the options:

[object-auditor]
# You can override the default log routing for this app here (don't use set!):
# log_name = object-auditor
# log_facility = LOG_LOCAL0
# log_level = INFO
# files_per_second = 20
# bytes_per_second = 10000000
# log_time = 3600
# zero_byte_files_per_second = 50

Another tool shipped with Swift is swift-account-audit, which audits a full account and reports if there are missing replicas or incorrect objects in that account.

Swift and Keystone middleware

[NB: Much has changed since I wrote this article, but I am keeping it here for information.]

It seems that integrating Swift and Keystone presents some challenges to people, and this is absolutely normal as there are a lot of changes going on. This is my attempt to document how everything is plugged together.

I am not going to explain how a middleware is supposed to work as this is nicely documented on Wikipedia :

http://en.wikipedia.org/wiki/Middleware

or how auth middleware works in Swift:

http://swift.openstack.org/development_auth.html

or even how this is plugged inside Keystone :

http://keystone.openstack.org/middleware_architecture.html

First, let's get some of the terminology right:

  • A tenant in keystone is an account in swift.
  • A user in keystone is also a user in swift.
  • A role in keystone is a group in swift.

With this in mind, let's walk through what a request looks like.

First your user connects to Keystone and says: here is my username for this tenant and here is the secret/API key; give me the endpoints for the services and a token to go with them. This looks like this in curl:

curl -s -d '{"auth": {"tenantName": "demo", "passwordCredentials": {"username": "demo", "password": "password"}}}' -H 'Content-type: application/json' http://localhost:5000/v2.0/tokens

If successfully authenticated, you get back in JSON the public/internal URLs for Swift so you are able to connect. Here is part of the reply:

{
    "endpoints": [
        {
            "adminURL": "http://localhost:8080/",
            "internalURL": "http://localhost:8080/v1/AUTH_2",
            "publicURL": "http://localhost:8080/v1/AUTH_2",
            "region": "RegionOne"
        }
    ],
    "name": "swift",
    "type": "object-store"
}
[...]
"token": {
    "expires": "2011-11-24T12:35:56",
    "id": "ea29dae7-4c54-4e80-98e1-9f886acb389a",
    "tenant": {
        "id": "2",
        "name": "demo"
    }
},

So now the client takes the publicURL (or the internal one) along with the token and is able to send requests to Swift with it. Let's take the simple request that lists the containers, which is a basic GET on the account:

curl -v -H 'X-Auth-Token: ea29dae7-4c54-4e80-98e1-9f886acb389a' http://localhost:8080/v1/AUTH_2

which should come back with a 20x HTTP code if it works.
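
Putting those two steps together, here is a minimal Python 2 sketch of what a client does (standard library only; the URLs and credentials are the demo values used above):

import json
import urllib2

# step 1: authenticate against Keystone, grab a token and the Swift endpoint
auth_body = json.dumps({'auth': {
    'tenantName': 'demo',
    'passwordCredentials': {'username': 'demo', 'password': 'password'}}})
request = urllib2.Request('http://localhost:5000/v2.0/tokens', auth_body,
                          {'Content-type': 'application/json'})
access = json.loads(urllib2.urlopen(request).read())['access']

token = access['token']['id']
storage_url = [service for service in access['serviceCatalog']
               if service['type'] == 'object-store'][0]['endpoints'][0]['publicURL']

# step 2: list the containers of the account with that token
request = urllib2.Request(storage_url, headers={'X-Auth-Token': token})
print(urllib2.urlopen(request).read())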

What happens here is that when you connect to Swift, the request is passed to the middleware to make sure you are allowed access with that token.

The middleware takes that token, connects to the Keystone admin URL with the admin token, and passes the user token to be validated. The query looks like this in curl:

curl -H 'X-Auth-Token: 7XX' http://localhost:35357/v2.0/tokens/ea29dae7-4c54-4e80-98e1-9f886acb389a

Note: localhost:35357 is the Keystone admin URL and 7XX is the admin token set in the middleware configuration.

If successful, Keystone comes back with a reply that looks like this:

{
    "access": {
        "token": {
            "expires": "2011-11-24T12:35:56",
            "id": "ea29dae7-4c54-4e80-98e1-9f886acb389a",
            "tenant": {
                "id": "2",
                "name": "demo"
            }
        },
        "user": {
            "id": "2",
            "roles": [
                {
                    "id": "2",
                    "name": "Member",
                    "tenantId": "2"
                },
                {
                    "id": "5",
                    "name": "SwiftOperator",
                    "tenantId": "2"
                }
            ],
            "username": "demo"
        }
    }
}

Let's step back before more curl commands and understand something about Swift: by default a user of an account in Swift doesn't have any rights at all, but there is one user in that account who is able to set ACLs on containers for other users. In the Swift Keystone middleware we call it an operator.

The way the middleware knows which users are able to be admin on an account is by matching their roles against the middleware setting called:

keystone_swift_operator_roles = Admin, SwiftOperator

Since our user has the SwiftOperator role, it has access and is allowed to do whatever it wants in that account, like creating containers or giving ACLs to other users.
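
Conceptually, the check the middleware performs on that validated reply is roughly this (a simplified sketch, not the actual middleware code):

import json

# a trimmed-down version of the validation reply shown above
validated = json.loads('''{"access": {"user": {"username": "demo",
    "roles": [{"name": "Member"}, {"name": "SwiftOperator"}]}}}''')

operator_roles = [r.strip() for r in 'Admin, SwiftOperator'.split(',')]
user_roles = [role['name'] for role in validated['access']['user']['roles']]

# the user is an operator on the account if any of its roles match
is_operator = any(role in operator_roles for role in user_roles)
print(is_operator)  # True, so 'demo' can manage the account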

So let's say we have a user called demo2 who is part of the demo account but only has the Member role and not SwiftOperator; by default, as we said before, he will not be able to do much.

But if the demo user gives the Member group/role access to a container via an ACL, then demo2 will be able to do things with it.

We could have fun with a bunch of curl commands, but since Swift 1.4.7 the swift CLI tool has support for auth version 2.0 and allows you to connect to Keystone for auth, so we are going to use that instead.

Let's first create a testcontainer and upload a file (/etc/issue) into it with our 'operator' user:

swift --auth-version 2 -A http://localhost:5000/v2.0 -U demo:demo -K password post testcontainer
swift --auth-version 2 -A http://localhost:5000/v2.0 -U demo:demo -K password upload testcontainer /etc/issue

Now let's give the Member group read access to that container:

swift --auth-version 2 -A http://localhost:5000/v2.0 -U demo:demo -K password post testcontainer -r Member

and now if we try to read that file directly as demo2 it will be allowed:

swift --auth-version 2 -A http://localhost:5000/v2.0 -U demo:demo2 -K password download testcontainer etc/issue -o-

I hope this makes it a bit clearer how everything works. In the next part I am going to explain what the config files and packages look like for installing Keystone and Swift.