Tuesday, July 10, 2018

Anatomy of a Cinder driver

In this post we'll discuss how to implement a full NFS-based Cinder driver for the world-famous ACME NFS appliance. This driver differs from most similar drivers in that it maintains a separate NFS share for each volume; most (if not all) other NFS drivers keep all volumes mounted under a fixed set of shares.

We will assume that our appliance can create share snapshots and clones. The driver fully supports snapshots and clones.

Since the driver can manage many volumes, it doesn't attempt to mount all of them on the Cinder host. Instead, a volume is mounted only when it is needed and then immediately unmounted. This is a major difference from other NFS-based drivers.

Note that Cinder sometimes mounts volumes by itself (for example, when it creates a volume from an image). It does such mounts using the remotefs brick. To avoid dangling mounts we do the following (a sketch of the corresponding helpers follows this list):
  • Use the same remotefs brick for our own mounts so that we share mountpoints. As a result, if a volume is already mounted by the driver, it won't be mounted again by the remotefs brick.
  • Explicitly mount the volume in create_export() and unmount it in remove_export(). As a result, Cinder will skip an already mounted volume, and it will be properly unmounted when remove_export() is called.
  • Explicitly unmount a volume on delete. This is safe since unmounting a volume that isn't mounted is fine.
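
Here is a minimal sketch of how the mount handling could look. The helper names (_remotefs_client, _mount, _unmount, __get_mount_point) are my own placeholders rather than a fixed Cinder API, and the exact RemoteFsClient arguments may differ between releases:

def _remotefs_client(self):
    # Reuse the same remotefs brick mounter that Cinder itself uses, so our
    # mountpoints match the ones Cinder picks when it mounts the volume
    return remotefs_brick.RemoteFsClient(
        'nfs', self.__root_helper, execute=self._execute,
        nfs_mount_point_base=self.base,
        nfs_mount_options=self.configuration.nfs_mount_options)

def __get_mount_point(self, volume):
    # Mount point is derived from the share export (provider_location)
    return self._remotefs_client().get_mount_point(
        volume['provider_location'])

def _mount(self, volume):
    # The brick skips shares that are already mounted, so this is idempotent
    self._remotefs_client().mount(volume['provider_location'])

def _unmount(self, volume):
    # Unmounting a share that isn't mounted is treated as success
    try:
        self._execute('umount', self.__get_mount_point(volume),
                      run_as_root=self.__execute_as_root)
    except putils.ProcessExecutionError as e:
        if 'not mounted' not in (e.stderr or ''):
            raise

def create_export(self, ctx, volume, connector=None):
    # Keep the volume mounted while Cinder needs local access to it
    self._mount(volume)

def remove_export(self, ctx, volume):
    self._unmount(volume)
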
For each volume we keep two things in the volume object:
  • 'provider_id' is the acme share UUID corresponding to the volume
  • 'provider_location' is the share NFS export in a format acceptable to the mount.nfs command, e.g. 1.2.3.4:/foo/bar
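
The appliance API in this post returns share exports as URIs, so the code below uses a small _parse_uri() helper to split a URI into address and path parts. A possible sketch, assuming a simple nfs://host/path format:

def _parse_uri(self, uri):
    # Split e.g. nfs://1.2.3.4/foo/bar into ('1.2.3.4', '/foo/bar')
    m = re.match(r'^nfs://([^/]+)(/.+)$', uri)
    if not m:
        raise exception.VolumeBackendAPIException(
            data='Unexpected NFS URI: %s' % uri)
    return m.group(1), m.group(2)
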
So let's look at the actual code.

We'll need a bunch of imports.


import re
from os import path, rename
import errno

# remotefs location changed between releases

try:
    from os_brick.remotefs import remotefs as remotefs_brick
except Exception:
    from cinder.brick.remotefs import remotefs as remotefs_brick

from oslo_concurrency import processutils as putils
from oslo_config import cfg
from oslo_log import log as logging
from oslo_utils import units

from cinder import exception, utils, context
from cinder.i18n import _LI, _LW, _
from cinder.image import image_utils
from cinder.volume import driver
from cinder.volume import utils as volume_utils

VERSION = '1.0.0'
LOG = logging.getLogger(__name__)
Now let's define some configuration options.

acme_OPTS = [
    cfg.StrOpt('nas_user',
               default='',
               help='User name for NAS administration'),
    cfg.StrOpt('nas_password',
               default='',
               help='User password for NAS administration'),
    cfg.BoolOpt('nfs_sparsed_volumes',
                default=True,
                help=('Create volumes as sparsed files which take no space. '
                      'If set to False, volume is created as a regular file. '
                      'In that case volume creation takes a lot of time.')),
    cfg.StrOpt('nfs_mount_point_base',
               default='$state_path/mnt',
               help='Base dir containing mount points for NFS shares.'),
    cfg.StrOpt('nfs_mount_options',
               help=('Mount options passed to the NFS client. See section '
                     'of the NFS man page for details.')),
    cfg.IntOpt('nfs_mount_attempts',
               default=3,
               help=('The number of attempts to mount NFS shares before '
                     'raising an error.  At least one attempt will be '
                     'made to mount an NFS share, regardless of the '
                     'value specified.')),
    cfg.StrOpt('nas_secure_file_operations',
               default='auto',
               help=('Allow network-attached storage systems to operate in a '
                     'secure environment where root level access is not '
                     'permitted. If set to False, access is as the root user '
                     'and insecure. If set to True, access is not as root. '
                     'If set to auto, a check is done to determine if this is '
                     'a new installation: True is used if so, otherwise '
                     'False. Default is auto.')),
    cfg.StrOpt('nas_secure_file_permissions',
               default='auto',
               help=('Set more secure file permissions on network-attached '
                     'storage volume files to restrict broad other/world '
                     'access. If set to False, volumes are created with open '
                     'permissions. If set to True, volumes are created with '
                     'permissions for the cinder user and group (660). If '
                     'set to auto, a check is done to determine if '
                     'this is a new installation: True is used if so, '
                      'otherwise False. Default is auto.')),
]

CONF = cfg.CONF
CONF.register_opts(acme_OPTS)

Our class will inherit from a few other Cinder base classes.
Note that we are not using any of the existing NFS drivers.

class acmeNfsDriver(driver.ExtendVD, driver.LocalVD, driver.TransferVD,
                     driver.BaseVD):
    """
    acme NFS Driver
    """

We will use a few class-level attributes:

    volume_backend_name = 'acme_NFS'
    protocol = driver_prefix = driver_volume_type = 'nfs'
    VERSION = VERSION
    VOLUME_FILE_NAME = 'volume'
    __VENDOR = 'ACME STORAGE, Inc'
    __VOLUME_PREFIX = 'cinder_'
    __NFS_OPTIONS = 'v3'

The __init__() method is pretty simple:

def __init__(self, execute=putils.execute, *args, **kwargs):
    self.__root_helper = utils.get_root_helper()
    self.__execute_as_root = True

    super(acmeNfsDriver, self).__init__(execute, *args, **kwargs)

    if self.configuration:
        self.configuration.append_config_values(acme_OPTS)

    self.base = path.realpath(getattr(self.configuration,
                                      'nfs_mount_point_base',
                                      CONF.nfs_mount_point_base))

    self._sparse_copy_volume_data = True
    self.reserved_percentage = self.configuration.reserved_percentage
    self.max_over_subscription_ratio = (
        self.configuration.max_over_subscription_ratio)

    self.shares = {}

    self.__mount_paths = {}

We need to define a do_setup() method, which doesn't do anything useful here.

def do_setup(self, ctx):
    """acme driver initialization"""
    pass

We can also check for any setup errors which we skip for simplicity:

def check_for_setup_error(self):
    """Check for any setup errors
    This method is called immediately after do_setup()    """    # Perform any necessary checks
    pass

Drivers usually define local_path() but it isn't clear who is actually using it.
def local_path(self, volume):
    """Return path to the volume on a local file system (from LocalVD).

    NOTE: It isn't clear who actually uses it.

    :param volume: Volume reference
    :type volume: cinder.objects.volume.Volume
    :returns: path to the volume
    """
    return path.join(self.__get_mount_point(volume), volume['name'])

Creating and destroying the volume

Now we come to the important part - creating the volume.

def create_volume(self, volume):
    """Create volume
    When the volume is created, it has the following attributes:
    - proider_location: host:/path used to mount the share
    - provider_id: acme share UUID for the volume
    Every Cinder NFS volume is backed by its own acme NFS Share. The    share has only the volume file in it.
    :param volume: volume object    :type volume: cinder.objects.volume.Volume    :returns: volumeDict -- dictionary of volume attributes    """    LOG.debug(_LI('acme: Creating volume %s'), volume.id)

    # Create acme share for this volume
    share_name = self._get_volume_name(volume)
    uri = volume['provider_location']
    share_uuid = None
    # Create backing acme share
    share_uuid, uri = acme_create_share(share_name)
    if not uri:
        raise exception.VolumeBackendAPIException(
            data='Missing share URI information')

    # Store share UUID in provider_id field
    addr, share_path = self._parse_uri(uri[0])

    volume['provider_location'] = addr + ":" + share_path
    volume['provider_id'] = share_uuid

    # Create backing file
    try:
        self._mount(volume)
        volume_path = self.local_path(volume)
        self._create_sparse_file(volume_path, volume['size'])
    except Exception as me:
        LOG.warning(_LW('acme: failed create volume: %s'), me)
        # Failed to mount, try to delete the share
        try:
            acme_delete_share(share_uuid)
        except Exception as e:
            LOG.warning(_LW('acme: failed to destroy share: %s'), e)
        raise
    try:
        self._set_rw_permissions(volume_path)
    except Exception as e:
        LOG.warning(_LW('acme: failed set permissions: %s'), e)

    # Unmount volume - we don't need it any more
    try:
        self._unmount(volume)
    except Exception as e:
        LOG.warning(_LW('acme: failed to unmount volume: %s'), e)

    LOG.debug(_LI('acme: Created volume %s'), volume.id)

    return {'provider_location': volume['provider_location'],
            'provider_id': share_uuid,
            }
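
The code above also relies on a few small helpers that aren't shown. A minimal sketch of how they could look; _get_volume_name(), _create_sparse_file() and _set_rw_permissions() are assumed names (the latter two loosely mirror what Cinder's RemoteFS-based drivers do):

def _get_volume_name(self, volume):
    # Appliance share name for a Cinder volume, e.g. cinder_<volume id>
    return self.__VOLUME_PREFIX + volume['id']

def _create_sparse_file(self, volume_path, size_gb):
    # Create a sparse backing file of the requested size; a non-sparse
    # variant would use dd or fallocate instead
    self._execute('truncate', '-s', '%dG' % size_gb, volume_path,
                  run_as_root=self.__execute_as_root)

def _set_rw_permissions(self, volume_path):
    # 660 keeps the file private to the cinder user/group; 666 corresponds
    # to nas_secure_file_permissions=False
    mode = ('666' if self.configuration.nas_secure_file_permissions == 'false'
            else '660')
    self._execute('chmod', mode, volume_path,
                  run_as_root=self.__execute_as_root)
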

And the counterpart - deleting the volume:

def delete_volume(self, volume):
    """Delete Cinder volume
    If volume has snapshots we attemot to delete them as well.
    :param volume: volume to delete    :type volume: cinder.objects.volume.Volume    """
    # Unmount the volume if it is mounted for some reason    # noinspection PyBroadException    LOG.debug(_LI('acme: delete_volue(%s)'), volume.id)

    # Unmount the volume if someone left it in the mounted state
    try:
        self._unmount(volume)
    except Exception as e:
        LOG.warning(_LW('acme: failed to unmount volume %s: %s'),
                    volume.id, e)

    share_uuid = volume['provider_id']
    # Are there any snapshots? If there are, attempt to delete them
    try:
        snapshots = acme_list_share_snapshots(share_uuid)
    except Exception as e:
        LOG.warning(_LW('acme: failed to get snapshot list for %s: %s'),
                    share_uuid, e)
    else:
        # Attempt to delete snapshots. If this fails we can't destroy a
        # volume
        for s in snapshots:
            try:
                LOG.debug(_LI('acme: delete snapshot %s for share %s'),
                          s, share_uuid)
                acme_delete_snapshot(share_uuid, s)
            except Exception as e:
                LOG.warning(_LW('acme: failed to delete snapshot %s'),
                            e)
                raise exception.VolumeBackendAPIException(data=str(e))

    # Attempt to delete share from the acme appliance
    try:
        acme_delete_share(share_uuid)
    except Exception as e:
        LOG.warning(_LW('acme: failed to delete share: %s'), e)

    LOG.debug(_LI('acme: Deleted volume %s'), volume.id)

Connecting to Nova instance

The following code is called when the volume is attached to a Nova instance:

def initialize_connection(self, volume, connector, initiator_data=None):
    """    Allow connection to connector and return connection info.    This method is called when a volume is attached to a Nova instance or    when a volume is used as a source for another volume.
    :param volume: volume reference    :param connector: connector reference    """
    data = {'export': volume['provider_location'], 'name': volume['name'],
            'options': '-o v3',
            }
    return {'driver_volume_type': self.driver_volume_type, 'data': data,
            'mount_point_base': self.base,
            }
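
The driver also needs a terminate_connection() counterpart; in this design it can be a no-op, since the actual unmount happens in remove_export(). A sketch:

def terminate_connection(self, volume, connector, **kwargs):
    # Nothing to clean up here: mounts are removed in remove_export()
    pass
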

Snapshots and clones

Creating a snapshot

def create_snapshot(self, snapshot):
    """    Create snapshot
    We store snapshot UUID in the provider_id field
    :param snapshot: Snapshot    """    LOG.debug(_LI('acme: create_snapshot(%s)'), snapshot)
    volume = self.__get_snapshot_volume(snapshot)
    try:
        uuid = acme_create_share_snapshot(volume['provider_id'],
                                          snapshot['name'])
    except Exception as e:
        LOG.warning(_LW('acme: got exception %s'), e)
        raise
    else:
        snapshot['provider_id'] = uuid
        LOG.debug(_LI('acme: created_snapshot %s'), uuid)
        return {'provider_id': uuid}

Destroying a snapshot

def delete_snapshot(self, snapshot):
    """    Delete snapshot    :param snapshot: Snapshot    """    LOG.debug(_LI('acme: delete_snapshot(%s)'), snapshot)
    volume = self.__get_snapshot_volume(snapshot)
    share_uuid = volume['provider_id']
    uuid = snapshot['provider_id']
    try:
        acme_delete_snapshot(share_uuid, uuid)
    except Exception as e:
        LOG.warning(_LW('acme: got exception %s'), e)
        raise
    else:
        LOG.debug(_LI('acme: deleted_snapshot(%s)'), uuid)
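
Both snapshot methods use a __get_snapshot_volume() helper to look up the volume that a snapshot belongs to. A minimal sketch, assuming the classic cinder.db API is available (newer snapshot objects can also reach the volume directly):

from cinder import db

def __get_snapshot_volume(self, snapshot):
    # Look up the parent volume of the snapshot under the admin context
    ctx = context.get_admin_context()
    return db.volume_get(ctx, snapshot['volume_id'])
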

Cloning a snapshot into a volume

def create_volume_from_snapshot(self, volume, snapshot):
    """Create new volume from other's snapshot on appliance.
    :param volume: reference of volume to be created    :param snapshot: reference of source snapshot    """    LOG.debug(_LI('acme: create_volume_from_snapshot(%s)'), snapshot)
    snapshot_uuid = snapshot['provider_id']
    LOG.debug(_LI('acme: clone %s'), snapshot_uuid)
    share_uuid, uris = acme_create_share(self._get_volume_name(volume),
                                         snapshot_uuid)

    # Store share UUID in provider_id field
    addr, share_path = self._parse_uri(uris[0])
    if self.__pool_address:
        addr = self.__pool_address
    provider_location = addr + ":" + share_path
    volume['provider_location'] = provider_location
    volume['provider_id'] = share_uuid
    self._mount(volume)

    # Get origin volume of the snapshot
    orig_volume = self.__get_snapshot_volume(snapshot)
    try:
        self._mount(orig_volume)
    except Exception as e:
        LOG.warning(_LW('acme: failed to mount volume %s: %s'),
                    orig_volume.id, e)
        try:
            self._unmount(volume)
        except Exception as e1:
            LOG.warning(_LW('acme: failed to unmount volume: %s: %s'),
                        volume.id, e1)
        raise
    # Rename the volume file.
    # The expected file name is based on volume['name'], but after the clone
    # we have orig_volume['name']
    new_name = self.local_path(volume)
    old_name = path.join(self.__get_mount_point(volume),
                         orig_volume['name'])

    try:
        rename(old_name, new_name)
    except Exception as e:
        LOG.warning(_LW('acme: rename failed: %s'), e)
        raise
    finally:
        try:
            self._unmount(volume)
            self._unmount(orig_volume)
        except Exception as e:
            LOG.warning(_LW('acme: failed to unmount volume: %s'), e)

    LOG.debug(_LI('acme: Created volume %s'), volume)

    return {'provider_location': provider_location,
            'provider_id': share_uuid,
            }

def create_cloned_volume(self, volume, src_vref):
    """    Creates a clone of the specified volume.
    :param volume: new volume reference    :param src_vref: source volume reference    """    LOG.info(_LI('Creating clone of volume: %s'), src_vref['id'])
    snapshot = {'volume_name': src_vref['name'],
                'volume_id': src_vref['id'],
                'name': 'cinder-snapshot=%(id)s' % volume,
                }
    # We don't delete this snapshot, because it will be the origin of the
    # new volume. The appliance will automatically promote the snapshot
    # when the user deletes its origin.
    self.create_snapshot(snapshot)
    try:
        return self.create_volume_from_snapshot(volume, snapshot)
    except Exception as e:
        LOG.warning(_LW('acme: volume creation from snapshot failed: %s'), e)
        self.delete_snapshot(snapshot)
        raise
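
Finally, the reserved_percentage and max_over_subscription_ratio values saved in __init__() are meant to be reported back to the Cinder scheduler through volume stats. A minimal sketch of get_volume_stats(); acme_get_pool_capacity() is a hypothetical appliance call returning total and free bytes:

def get_volume_stats(self, refresh=False):
    # Report backend capabilities and capacity to the scheduler
    total, free = acme_get_pool_capacity()
    return {
        'vendor_name': self.__VENDOR,
        'driver_version': self.VERSION,
        'volume_backend_name': self.volume_backend_name,
        'storage_protocol': self.protocol,
        'total_capacity_gb': float(total) / units.Gi,
        'free_capacity_gb': float(free) / units.Gi,
        'reserved_percentage': self.reserved_percentage,
        'max_over_subscription_ratio': self.max_over_subscription_ratio,
        'thin_provisioning_support': True,
        'sparse_copy_volume': self._sparse_copy_volume_data,
    }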