Migration
=========

Let's say we have created a blog post document which looks like this::

    >>> from datetime import datetime
    >>> from mongokit import *
    >>> con = Connection()

    class BlogPost(Document):
        structure = {
            "blog_post":{
                "title": unicode,
                "created_at": datetime,
                "body": unicode,
            }
        }
        default_values = {'blog_post.created_at':datetime.utcnow}


Let's create some blog posts::

    >>> for i in range(10):
    ...     con.test.tutorial.BlogPost({'blog_post': {'title': u'hello %s' % i, 'body': u'I am post number %s' % i}}).save()

Now, development goes on and we add a 'tags' field to our `BlogPost`::

    class BlogPost(Document):
        structure = {
            "blog_post":{
                "title": unicode,
                "created_at": datetime,
                "body": unicode,
                "tags":  [unicode],
            }
        }
        default_values = {'blog_post.created_at':datetime.utcnow}

We will run into trouble when we save a fetched document, because its
structure no longer matches::

    >>> blog_post = con.test.tutorial.BlogPost.find_one()
    >>> blog_post['blog_post']['title'] = u'Hello World'
    >>> blog_post.save()
    Traceback (most recent call last):
        ...
    StructureError: missed fields : ['tags']

To fix this issue, we have to add the 'tags' field manually to every
`BlogPost` document in the database::

    >>> con.test.tutorial.update({'blog_post':{'$exists':True}, 'blog_post.tags':{'$exists':False}},
    ...    {'$set':{'blog_post.tags':[]}}, multi=True)

Now we can save our blog post::

    >>> blog_post.reload()
    >>> blog_post['blog_post']['title'] = u'Hello World'
    >>> blog_post.save()

Lazy migration
--------------

.. IMPORTANT::
    You cannot use this feature if `use_schemaless` is set to True

MongoKit provides a convenient way to define migration rules and apply them
lazily. Here is how to do it, continuing with the previous example.

Let's create a `BlogPostMigration` class which inherits from `DocumentMigration`::

    class BlogPostMigration(DocumentMigration):
        def migration01__add_tags_field(self):
            self.target = {'blog_post':{'$exists':True}, 'blog_post.tags':{'$exists':False}}
            self.update = {'$set':{'blog_post.tags':[]}}


How does it work? Each migration rule is a simple method of
`BlogPostMigration`. Its name must begin with `migration` and be numbered (so
that the rules are applied in a defined order). The rest of the name should
describe the rule. Here, we create our first rule (`migration01`), which adds a
'tags' field to our `BlogPost`.
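
The naming convention can be illustrated with plain Python, without MongoKit.
The sketch below assumes that rules are discovered by their name prefix and run
in sorted name order; `ToyMigration` and `rule_names` are hypothetical names
used only for this illustration, not MongoKit's actual implementation::

    class ToyMigration(object):
        """Stand-in for a DocumentMigration subclass (hypothetical)."""

        def migration02__rename_created_at(self):
            pass

        def migration01__add_tags_field(self):
            pass

        def helper(self):
            # Not a rule: the name does not start with 'migration'.
            pass


    def rule_names(migration_cls, prefix='migration'):
        # Keep the attributes whose name starts with the prefix and sort them,
        # so the number embedded in the name decides the order.
        return sorted(name for name in dir(migration_cls)
                      if name.startswith(prefix))


    ordered = rule_names(ToyMigration)
    # ordered == ['migration01__add_tags_field',
    #             'migration02__rename_created_at']

In this sketch the order is purely lexicographic, so `migration10` would sort
before `migration2`; zero-padded numbers such as `01` and `02` avoid that.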

Each rule must set two attributes: `self.target` and `self.update`. Both are
regular MongoDB queries.

`self.target` tells MongoKit which documents this rule applies to: any document
matching this query will be migrated.

`self.update` is a MongoDB update query with modifiers. It describes the update
to apply to the matching documents.
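
To make the two attributes concrete, here is a minimal in-memory sketch of what
the query and the update from `migration01__add_tags_field` mean. It only
understands `$exists` on dotted paths and `$set`; `get_path`, `matches` and
`apply_set` are hypothetical helpers written for this illustration and are not
part of MongoKit or MongoDB::

    _MISSING = object()


    def get_path(doc, dotted):
        # Walk a dotted path such as 'blog_post.tags'; _MISSING if absent.
        current = doc
        for key in dotted.split('.'):
            if not isinstance(current, dict) or key not in current:
                return _MISSING
            current = current[key]
        return current


    def matches(doc, target):
        # True when every {'$exists': bool} condition in target holds for doc.
        for path, condition in target.items():
            present = get_path(doc, path) is not _MISSING
            if present != condition['$exists']:
                return False
        return True


    def apply_set(doc, update):
        # Apply a {'$set': {dotted_path: value}} update in place.
        for path, value in update['$set'].items():
            keys = path.split('.')
            parent = doc
            for key in keys[:-1]:
                parent = parent.setdefault(key, {})
            parent[keys[-1]] = value


    target = {'blog_post': {'$exists': True},
              'blog_post.tags': {'$exists': False}}
    update = {'$set': {'blog_post.tags': []}}

    old_doc = {'blog_post': {'title': 'hello 0', 'body': 'post 0'}}
    new_doc = {'blog_post': {'title': 'hello 1', 'body': 'post 1',
                             'tags': ['a']}}

    for doc in (old_doc, new_doc):
        if matches(doc, target):      # only old_doc lacks 'tags'
            apply_set(doc, update)

After the loop, `old_doc` has an empty `tags` list and `new_doc` is untouched,
because it never matched the target.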

Now that our `BlogPostMigration` is created, we have to tell MongoKit which
document these migration rules apply to. To do that, we set the
`migration_handler` in `BlogPost`::

    class BlogPost(Document):
        structure = {
            "blog_post":{
                "title": unicode,
                "created_at": datetime,
                "body": unicode,
                "tags": [unicode],
            }
        }
        default_values = {'blog_post.created_at':datetime.utcnow}
        migration_handler = BlogPostMigration

Each time an error is raised while validating a document, the migration rules
are applied to the object and the document is reloaded.
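
The control flow can be pictured with a small stand-alone sketch. It is only an
assumption about the general shape of "validate, migrate on failure, validate
again": `ToyStructureError`, `validate`, `add_tags_field` and `save` are
hypothetical, the store is a plain dict, and the reload step is not modelled::

    class ToyStructureError(Exception):
        """Stand-in for the validation error (hypothetical)."""


    REQUIRED_FIELDS = ('title', 'body', 'tags')


    def validate(doc):
        missing = [name for name in REQUIRED_FIELDS
                   if name not in doc['blog_post']]
        if missing:
            raise ToyStructureError('missed fields : %r' % missing)


    def add_tags_field(doc):
        # Toy migration rule: add an empty 'tags' list when it is absent.
        doc['blog_post'].setdefault('tags', [])


    def save(doc, rules, store):
        try:
            validate(doc)
        except ToyStructureError:
            # Validation failed: run every rule, then validate once more.
            for rule in rules:
                rule(doc)
            validate(doc)
        store[doc['_id']] = doc


    store = {}
    old_doc = {'_id': 1, 'blog_post': {'title': 'hello 0', 'body': 'post 0'}}
    save(old_doc, [add_tags_field], store)

Documents that already validate never enter the `except` branch, which is why
the cost of a lazy migration is spread over the documents as they are saved.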

.. CAUTION::
    If `migration_handler` is set, then `skip_validation` is deactivated.
    Validation must be on to allow lazy migration.

Bulk migration
--------------

Lazy migration is useful if you have many documents to migrate, because a large
update will lock the database. But sometimes you want to migrate only a few
documents and you don't want to slow down your application with validation. You
should then use bulk migration.

Bulk migration works like lazy migration, but the `DocumentMigration` methods
must start with `allmigration`. Lazy migration adds the document `_id` to
`self.target`; bulk migration does not, so `self.target` itself has to be
specific enough to select the documents to migrate. Here is an example of bulk
migration: we finally want to remove the `tags` field from `BlogPost`::

    class BlogPost(Document):
        structure = {
            "blog_post":{
                "title": unicode,
                "created_at": datetime,
                "body": unicode,
            }
        }
        default_values = {'blog_post.created_at':datetime.utcnow}

Note that we don't need to set the `migration_handler`; it is required only for
lazy migration.

Let's edit the `BlogPostMigration`::

    class BlogPostMigration(DocumentMigration):
        def allmigration01__remove_tags(self):
            self.target = {'blog_post.tags':{'$exists':True}}
            self.update = {'$unset':{'blog_post.tags':[]}}


To apply the migration, instantiate `BlogPostMigration` and call its
`migrate_all` method::

    >>> migration = BlogPostMigration(BlogPost)
    >>> migration.migrate_all(collection=con.test.tutorial)


.. NOTE::
    Because `migration_*` methods are not called by `migrate_all()`, you
    can mix `migration_*` and `allmigration_*` methods in the same class.

Migration status
----------------

Once all your documents have been migrated, some migration rules could become
deprecated. To know which rules are deprecated, use the `get_deprecated` method::

    >>> migration = BlogPostMigration(BlogPost)
    >>> migration.get_deprecated(collection=con.test.tutorial)
    {'deprecated':['allmigration01__remove_tags'], 'active':['migration02__rename_created_at']}

Here, we can remove the rule `allmigration01__remove_tags`.
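
As a rough mental model, a rule can be considered deprecated once no stored
document matches its target any more. The sketch below assumes exactly that and
uses plain Python predicates instead of MongoDB queries; `classify_rules` and
the sample documents are hypothetical::

    def classify_rules(rules, docs):
        # rules maps a rule name to a predicate telling whether a document
        # still needs that rule.
        status = {'deprecated': [], 'active': []}
        for name in sorted(rules):
            needs_migration = rules[name]
            if any(needs_migration(doc) for doc in docs):
                status['active'].append(name)
            else:
                status['deprecated'].append(name)
        return status


    docs = [
        {'blog_post': {'title': 'hello 0', 'created_at': 'some date'}},
        {'blog_post': {'title': 'hello 1', 'created_at': 'some date'}},
    ]
    rules = {
        'allmigration01__remove_tags':
            lambda doc: 'tags' in doc['blog_post'],
        'migration02__rename_created_at':
            lambda doc: 'created_at' in doc['blog_post'],
    }
    status = classify_rules(rules, docs)

No sample document has a `tags` field left, so `allmigration01__remove_tags`
ends up under `'deprecated'`, while the rename rule still has work to do and
stays under `'active'`.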


Advanced migration
------------------

Lazy migration
~~~~~~~~~~~~~~

Sometimes we want to build a more advanced migration. For instance, to copy a
field value into another field, you can access the current document via
`self.doc`. In the following example, we want to add an `update_date` field and
copy the `creation_date` value into it::

    class BlogPostMigration(DocumentMigration):
        def migration01__add_update_field_and_fill_it(self):
            self.target = {'blog_post.update_date':{'$exists':False}, 'blog_post':{'$exists':True}}
            self.update = {'$set':{'blog_post.update_date': self.doc['blog_post']['creation_date']}}


Advanced and bulk migration
~~~~~~~~~~~~~~~~~~~~~~~~~~~

If you want to do the same thing with bulk migration, things are a little different::

    class BlogPostMigration(DocumentMigration):
        def allmigration01__add_update_field_and_fill_it(self):
            self.target = {'blog_post.update_date':{'$exists':False}, 'blog_post':{'$exists':True}}
            if not self.status:
                for doc in self.collection.find(self.target):
                    self.update = {'$set':{'blog_post.update_date': doc['blog_post']['creation_date']}}
                    # Update only this document, so that each one gets its own creation_date
                    self.collection.update({'_id': doc['_id']}, self.update, safe=True)

In this example, the method `allmigration01__add_update_field_and_fill_it`
modifies the database directly, and it is also called by `get_deprecated()`.
Since calling `get_deprecated()` should not harm the database, we need to
specify which portion of the code must be skipped in that case. This is the
purpose of the `if not self.status:` check.
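
The role of the status flag can be reproduced in isolation. The following is a
minimal sketch, assuming the flag is true while rules are only being inspected;
`ToyBulkMigration`, its `inspect` method and the list-based store are
hypothetical and do not mirror MongoKit's internals::

    class ToyBulkMigration(object):
        def __init__(self, store):
            self.store = store    # list of dicts standing in for a collection
            self.status = False   # True while rules are only being inspected
            self.target = None
            self.writes = 0

        def allmigration01__add_update_field_and_fill_it(self):
            self.target = lambda doc: 'update_date' not in doc['blog_post']
            if not self.status:
                # Side effects happen only during a real migration run.
                for doc in self.store:
                    if self.target(doc):
                        post = doc['blog_post']
                        post['update_date'] = post['creation_date']
                        self.writes += 1

        def inspect(self):
            # Dry run: execute the rule to learn its target, without writing.
            self.status = True
            try:
                self.allmigration01__add_update_field_and_fill_it()
                return sum(1 for doc in self.store if self.target(doc))
            finally:
                self.status = False


    store = [
        {'blog_post': {'title': 'hello 0', 'creation_date': 'day 1'}},
        {'blog_post': {'title': 'hello 1', 'creation_date': 'day 2'}},
    ]
    migration = ToyBulkMigration(store)
    pending_before = migration.inspect()     # counts matches, writes nothing
    migration.allmigration01__add_update_field_and_fill_it()
    pending_after = migration.inspect()

Here `inspect()` reports how many documents still match the target, and the
write counter only moves when the rule runs outside of inspection.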
