elarson’s posterous

 
Filed under

mercurial

 

My Org-Mode Server

I went ahead and started working out how to get my todo list online. I started off pretty simple and ended up with a relatively nice system. The basic idea is that I can push my org files to my webserver and edit them. Likewise, I can pull from the server. It started with some simple paver scripts that uploaded the files and quickly became an actual application.

Here is the paver file for some of the operations:


import os
from mercurial import commands, ui, hg
from paver.easy import *
import subprocess

IONROCK_HG = 'ssh://eric@ionrock.org/path/to/todos/'
REMOTE_TODO = IONROCK_HG # '/local/dev/path/to/todos'

@task
def server():
    import cherrypy
    cherrypy.tree.graft(TodoServer(base_url='/'), '/')
    cherrypy.quickstart()

@task
def create_repo():
    cmd = subprocess.call("fab create_repo:hosts='ionrock.org'", shell=True)

@task
def commit():
    conf = ui.ui()    
    user = conf.username()
    repo = hg.repository(conf, '.')
    files = [f for f in os.listdir('.') if f.endswith('.org')]
    commands.add(conf, repo, *files)
    commands.commit(conf, repo, addremove=True, message='Syncing org files')
    commands.push(conf, repo, REMOTE_TODO)

@task
@needs('commit')
def pull():
    conf = ui.ui()    
    user = conf.username()
    repo = hg.repository(conf, '.')
    commands.pull(conf, repo, REMOTE_TODO)
    commands.update(conf, repo)

@task
@needs('commit')
def update():
    cmd = subprocess.call("fab update_todos:hosts='ionrock.org'", shell=True)
    

The server task starts the eventual web application for development. The commit task automatically commits the current org files and pushes them to the remote server. The pull task runs commit first, then pulls from the remote server and updates the working copy. These two tasks use the Mercurial Python libraries to work with the repos.
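
For anyone without Paver or the Mercurial Python API at hand, the same commit-then-push flow can be sketched with the hg command line. This is only a sketch: org_files and sync_commands are hypothetical helpers that are not part of the pavement file above, and the commands are only built here, not run.

```python
import os


def org_files(directory):
    """Return the sorted names of the .org files in `directory`."""
    return sorted(f for f in os.listdir(directory) if f.endswith('.org'))


def sync_commands(files, remote, message='Syncing org files'):
    """Build the hg invocations that mirror the commit task.

    Nothing is executed; each command is a list that could be handed
    to subprocess.check_call().
    """
    return [
        ['hg', 'add'] + list(files),
        ['hg', 'commit', '--addremove', '-m', message],
        ['hg', 'push', remote],
    ]
```

Passing each list from sync_commands(org_files('.'), REMOTE_TODO) to subprocess.check_call() would run the three steps in order and stop at the first failure.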

The create_repo task just creates a Mercurial repo on the server. More interesting is the update task, which updates the working copy of the remote todo repo. I'm using Fabric for this part, and it was all really easy. Here is the fabfile:


from fabric.api import run

def update_todos():
    run('cd /home/eric/htdocs/todo && hg up')

def create_repo():
    run('cd /home/eric/htdocs/todo && hg init')

Hopefully it is really clear what is happening here. Fabric lets you run commands via ssh on a remote server.
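
Stripped down, that amounts to running a shell command over ssh. A rough equivalent without Fabric might look like the following, assuming an ssh client is installed and key-based login works. The helper names remote_command and run_remote are made up for this sketch, and it does not claim to match what Fabric does internally.

```python
import shlex
import subprocess


def remote_command(host, directory, command):
    """Build an ssh invocation that runs `command` inside `directory` on `host`."""
    # Quote the directory so unusual characters survive the remote shell.
    return ['ssh', host, 'cd %s && %s' % (shlex.quote(directory), command)]


def run_remote(host, directory, command):
    """Run the command on the remote host and return its exit status."""
    return subprocess.call(remote_command(host, directory, command))
```

Calling run_remote('ionrock.org', '/home/eric/htdocs/todo', 'hg up') would then play the role of the update_todos task.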

The actual todo server is a bit longer but also pretty simple.


import os
import re
import posixpath as path
import difflib
from selector import Selector
from webob import Response, Request
from webob.exc import *
from mercurial import commands, ui, hg
import datetime


class TodoFile(object):
    def __init__(self, fn):
        self.fn = fn
        self.html_diff = difflib.HtmlDiff()
        self.diff = difflib.Differ()
        self.matcher = difflib.SequenceMatcher()
        lines = [l for l in open(fn, 'r')]
        self.matcher.set_seq2(lines)

    def _hg(self):
        conf = ui.ui()
        user = conf.username()
        repo = hg.repository(conf, os.path.dirname(self.fn))
        return conf, repo, user

    def __str__(self):
        return ''.join(self.read())
    
    def read(self):
        return [l for l in open(self.fn, 'r')]

    def write(self, new):
        f = open(self.fn, 'w')
        # Strip the carriage returns that browsers add to textarea content
        clean = re.sub('\r', '', new)
        f.write(clean)
        f.close()
        conf, repo, user = self._hg()
        date = datetime.datetime.now().strftime('%m-%d-%y %H:%M')
        commands.commit(conf, repo, message='Web write on %s' % date)
        
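    def is_different(self, new):
        # Compare without line endings, since read() keeps the trailing newlines
        current = [l.rstrip('\n') for l in self.read()]
        self.matcher.set_seqs(new.split('\n'), current)
        return self.matcher.ratio() < 1.0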

    def diff_txt(self, new):
        return list(difflib.context_diff(new.split('\n'), self.read()))

    def diff_html(self, new):
        return self.html_diff.make_file(self.read(), new.split('\n'))        
    

class TodoStore(object):
    def __init__(self, directory):
        self.dir = os.path.abspath(directory)

    def get_todo(self, name):
        for fn in os.listdir(self.dir):
            if fn.endswith('.org') and (fn[:-4] == name):
                return TodoFile(os.path.join(self.dir, fn))
        return None

    def all(self):
        return [fn[:-4] for fn in os.listdir(self.dir) if fn.endswith('.org')]


class Auth(object):
    def __init__(self, creds, login_url, success_url=None):
        self.login_url = login_url
        self.success_url = success_url
        self.creds = creds

    def __call__(self, f):
        def func(env, sr):
            sess = env['beaker.session']
            if sess.get('auth.user'):
                return f(env, sr)
            req = Request(env)
            sess['auth.after_login_url'] = req.url
            sess.save()
            return HTTPSeeOther(location=self.login_url)(env, sr)
        return func

    def login(self, env, sr):
        res = Response()
        sess = env['beaker.session']
        flash = sess.get('flash', '')
        if flash:
            sess['flash'] = ''
            sess.save()
        res.write('''<div>%s</div>
        <form action="%s" method="post">
          <label for="username">Username</label>
          <input type="text" name="username" value=""><br />
          <label for="password">Password</label>
          <input type="password" name="password" value=""><br />
          <input type="submit" value="login" />
        </form>''' % (flash, self.login_url))
        return res(env, sr)

    def handle_login(self, env, sr):
        req = Request(env)
        post = req.POST
        sess = env['beaker.session']        
        if post.get('username') and post.get('password'):
            if self.creds.get(post['username']):
                if self.creds[post['username']] == post['password']:
                    sess['auth.user'] = post['username']
                    url = sess.get('auth.after_login_url', self.success_url)
                    sess.save()
                    return HTTPSeeOther(location=url)(env, sr)
        sess['flash'] = 'Error logging in.'
        sess.save()
        return HTTPSeeOther(location=self.login_url)(env, sr)
            

class TodoServer(object):

    def __init__(self, **config):
        self.conf = {
            'todo_dir': os.path.dirname(os.path.abspath(__file__)),
        }
        self.conf.update(config or {})

        self.auth = Auth(self.conf.get('creds', {}),
                         self.url('login'),
                         self.url())
        

        self.store = TodoStore(self.conf['todo_dir'])

        self.router = Selector([
            ('[/]', {'GET': self.listing}),
            ('/login[/]', {
                'GET': self.auth.login,
                'POST': self.auth.handle_login
            }),
            ('/{name}/edit[/]', {
                'GET': self.edit,
                'POST':  self.auth(self.update)
            }),
            ('/{name}[/]', {'GET': self.read}),
        ])

    def url(self, extras=None):
        extras = extras or ''
        if isinstance(extras, list):
            extras = '/'.join(extras)
        return path.join(self.conf['base_url'], extras)

    def _header(self):
        return '''<html><head>
        <title>org todo server</title>
        <style type="text/css">
        body {
            font-size: 2em; font-family: sans-serif;
        }
        </style>
        </head><body>
        '''

    def _footer(self):
        return '''</body></html>'''

    def edit(self, env, sr):
        res = Response()
        req = Request(env)
        name = req.urlvars['name']
        td = self.store.get_todo(name)

        res.write(self._header())
        res.write('''
        <form action="%s" method="post">
        <input type="submit" name="submit" value="save" /><br />        
        <textarea rows="50" cols="80" name="new_body">%s</textarea>
        </form>
        ''' % (self.url('%s/edit' % name), str(td)))
        res.write(self._footer())
        
        return res(env, sr)

    def update(self, env, sr):
        req = Request(env)
        name = req.urlvars['name']
        new_body = req.POST['new_body']
        todo = self.store.get_todo(name)
        todo.write(new_body.strip())
        location = self.url('%s' % name)
        return HTTPSeeOther(location=location)(env, sr)

    def read(self, env, sr):
        res = Response()
        req = Request(env)
        name = req.urlvars['name']

        res.write(self._header())
        res.write('''
        <a href="%s">Home</a> | <a href="%s">Edit</a>
        <hr />
        <pre>''' % (self.url(), self.url('%s/edit' % name)))
        td = self.store.get_todo(name)
        res.write(str(td))
        res.write('</pre>')
        res.write(self._footer())
        
        return res(env, sr)

    def listing(self, env, sr):
        res = Response()

        res.write(self._header())
        res.write('<ul>\n')
        for f in self.store.all():
            res.write('<li><a href="%s">%s</a></li>\n' % (self.url(f), f))
        res.write('</ul>\n')
        res.write(self._footer())
        
        return res(env, sr)

    def __call__(self, env, sr):
        return self.router(env, sr)
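
The diff helpers in TodoFile lean on the standard library's difflib. A small standalone sketch of the same idea shows one catch: lines read from a file keep their trailing newlines, so both sides should be normalized before comparing. The helper names (normalize, changed, unified) are hypothetical, and the sketch uses unified_diff rather than the context_diff used above.

```python
import difflib


def normalize(text_or_lines):
    """Return a list of lines without trailing line endings."""
    if isinstance(text_or_lines, str):
        return text_or_lines.splitlines()
    return [line.rstrip('\r\n') for line in text_or_lines]


def changed(old, new):
    """True when the two versions differ once line endings are ignored."""
    return normalize(old) != normalize(new)


def unified(old, new):
    """Return a unified diff of the two versions as a list of lines."""
    return list(difflib.unified_diff(normalize(old), normalize(new),
                                     'saved', 'submitted', lineterm=''))
```

Here old can be the list returned by TodoFile.read() and new the raw string posted from the textarea.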


This is a WSGI app simply because I'm using WSGI for my main application. I save a bit of memory by running all my smaller apps via one WSGI server (CherryPy), which makes a difference as I use a VPS.

One observation I made is that things would have been simpler had I been able to use CherryPy. Things like sessions, form processing and even URL routing would have been built in and made the whole thing a lot simpler in terms of dependencies and actual code.

This also made me realize what the problem is with building applications directly on WSGI: you really need a framework. I don't mean Pylons, web.py or some other WSGI framework, but you will undoubtedly write some glue code, such as request and response objects that deal with form handling, sessions and cookies. It is nice to know that it is so easy to create these micro frameworks, but at the same time, it is clear that people will make bad decisions along the way. I only say that because I'm one of them.
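
To make "glue code" concrete, here is roughly what form handling looks like against bare WSGI with only the standard library. It is only a sketch: parse_form, echo_app and call are hypothetical names, the form field is borrowed from the edit form above, and file uploads and encodings other than UTF-8 are ignored.

```python
import io
from urllib.parse import parse_qs


def parse_form(environ):
    """Read a urlencoded POST body from a WSGI environ into a dict."""
    try:
        length = int(environ.get('CONTENT_LENGTH') or 0)
    except ValueError:
        length = 0
    body = environ['wsgi.input'].read(length).decode('utf-8')
    # parse_qs returns lists of values; keep only the first of each.
    return {key: values[0] for key, values in parse_qs(body).items()}


def echo_app(environ, start_response):
    """A bare WSGI app that echoes back the submitted new_body field."""
    if environ['REQUEST_METHOD'] != 'POST':
        start_response('405 Method Not Allowed', [('Allow', 'POST')])
        return [b'']
    payload = parse_form(environ).get('new_body', '').encode('utf-8')
    start_response('200 OK', [('Content-Type', 'text/plain; charset=utf-8'),
                              ('Content-Length', str(len(payload)))])
    return [payload]


def call(app, method='GET', body=b''):
    """Invoke a WSGI app in-process and return (status, body)."""
    seen = {}
    environ = {'REQUEST_METHOD': method,
               'CONTENT_LENGTH': str(len(body)),
               'wsgi.input': io.BytesIO(body)}

    def start_response(status, headers):
        seen['status'] = status

    chunks = app(environ, start_response)
    return seen['status'], b''.join(chunks)
```

Every app ends up needing something like parse_form, and that is before sessions or cookies enter the picture.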

When I think of the micro frameworks I've written throughout the past few years, it is clear that I've had to experiment quite a bit. WebOb was a helpful library for sure, but the API you build translating the request to a WebOb Request means breaking WSGI at some level. That means that you've lost the advantages of WSGI as an API for your application. It makes me wonder why the app was written against WSGI in the first place when there is a solid and proven API already available in something like CherryPy.

I doubt I'll rewrite my whole site anytime soon, but if I do, the application will most definitely revolve around the framework rather than WSGI. The advantages that I believed were present ended up being much smaller than I thought. A tool like CherryPy takes care of the generic aspects well enough while letting me choose the more opinionated pieces such as templating or databases. You could most certainly substitute your framework of choice, but for me CherryPy is making more and more sense.

Filed under  //   emacs   mercurial   python  

Front Loaded Mercurial

I'm going to have to go back and see how I can avoid laying a big fat patch bomb on a repo and I'm not happy about it. There is no one to blame but myself. That doesn't make it any nicer. My big issue is that for all the cool features of Mercurial there is a consistent front loading requirement. You cannot simply work and then later construct the commits that you'll be pushing. MQ does help with this sort of thing and I'm going to have to find out just how much tomorrow, but it would have been really nice if I could have started coding and, when I finished, had a convenient way to go through all the files and commit them in reasonable chunks.

The astute reader will recognize that this issue is really just a sign of bad DVCS habits and I'm not about to argue otherwise. Still, I'm very much a part of the "not a great coder" club, and as such seem like a good test case for how to help out the normal developers using these powerful tools. One might also suggest that I open a ticket, or even better, contribute a patch. Again, my "not a great coder" club membership explicitly states that any gripes need to stay far away from those folks getting a lot of work done (a la the mercurial devs), hence I'm totally fine leaving my whining here on my blog. My bet is that bringing it up here will do more to improve my own habits than suggesting to others that they are real problems.

Next time I'm really going to do a better job managing my patches. Feel free to hold my feet to the fire in the future and see how I've done.

Filed under  //   mercurial   programming  

Grokking Mercurial Queues

I've mentioned plenty of times that we use Mercurial at work. Well the other day Dowski mentioned that he started playing with Mercurial Queues (mq) and found it to be pretty helpful. As I'm always interested to see where these more advanced DVCS concepts lead, I gave it a try.

My basic use case is that I'm working on a feature and I have to change gears to work on something else. This usually involves pulling from the main repo, doing whatever new work I need to do and pushing it back before going back to where I left off. When using mq, the idea is that you are working on a set of patches locally that, when finished, you'll push or send to someone. The way these patches are organized is in a stack. Each patch gets a name and they can be applied in a specific order, with the age of the patch being the default.

So, the smart way to start would be to create a new patch for a new feature. I rarely manage that, so I have to figure out another way. I usually have some commits and a bunch of stuff I haven't committed just yet, so here's what I do:



> hg ci -m 'Commit my outstanding change before pulling'
> hg log -l 5 # check for the latest revision I want to effectively "rebase" to 
> hg export -r $REV > working.patch
> hg strip $REV # this removes $REV and everything after it, taking the repo back to where it was before that revision
> hg pull -u # get the latest and update


At this point my repo is all up to date and my only record of my previous work is the patch file I exported. Not very cool, so let's fix that first by storing my work in the queue.



> hg qinit -c # this versions the patch queue so you can save patches even after they're done
> hg qnew -m 'Add my working patch for later' working-feature
> patch -i working.patch -p1
> hg qrefresh


While it may not be clear, what happens is that the changes get saved on the queue in a patch called "working-feature". The qnew creates the patch, the patch command makes the changes, and the qrefresh updates the current patch with those changes. It should also be noted that this updates the patch's changeset in the repo. These patches are actual changesets, so you can use the other hg commands. The bisect command is usually given as a good example, but anything should work, such as hg diff.

Now that I have a patch in the queue, I can see what has happened.



> hg qapplied


This will tell me what patches have been applied. There are effectively two stacks involved. There is the applied stack and the series stack. The applied stack is considered to be applied to the repo, meaning the source files contain the changes in the patch. The series stack lists all the patches in the queue whether they have been applied or not.
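
One way to picture the two stacks is a single ordered list (the series) plus a marker for how many patches from the front are applied. The class below is a toy model under that assumption. PatchQueue and its attributes are invented for illustration and say nothing about how mq actually stores patches.

```python
class PatchQueue(object):
    """Toy model of an mq queue: an ordered series plus an applied count."""

    def __init__(self):
        self.series = []    # every patch in the queue, in order
        self.depth = 0      # how many patches from the front are applied

    @property
    def applied(self):
        return self.series[:self.depth]

    @property
    def unapplied(self):
        return self.series[self.depth:]

    def qnew(self, name):
        """Create a patch on top of the applied ones and apply it."""
        self.series.insert(self.depth, name)
        self.depth += 1

    def qpop(self):
        """Un-apply the top patch; it stays in the series."""
        if self.depth == 0:
            raise IndexError('no patches applied')
        self.depth -= 1
        return self.series[self.depth]

    def qpush(self):
        """Apply the next unapplied patch."""
        if self.depth == len(self.series):
            raise IndexError('all patches applied')
        self.depth += 1
        return self.series[self.depth - 1]
```

In this model, popping a patch only moves the marker, so the patch drops out of applied but never leaves series.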

In this example, I want to take my current work, the working-feature patch, and not have it applied since the changes are unfinished. I then want to make changes for some other feature, commit them, push and finally start back where I left off.



> hg qpop # takes my currently applied patch and un-applies it, yet keeps it in available patches in the queue
> hg qnew -m 'A hotfix for production' fix-for-bug # create a new patch for my fix


# fix the bug!


> hg qrefresh # update my bug fix patch
> hg qnew -m 'Update the tests for the bug' fix-for-bug-tests # create patch updating the tests


# update the tests


> hg qrefresh # update my second patch


# ready to commit!


> hg qfinish fix-for-bug && hg qfinish fix-for-bug-tests 
# another option
> hg qfinish -a # turn all the applied changesets into real committed changesets
> hg push
> hg qpush working-feature # back to where we left off


Things can obviously get more complicated than this, but off hand it makes a good deal of sense once you give it a try. I should also mention that there might be better ways to deal with external patches like I did initially. There is a qimport command (for example) that seems promising assuming you're working with patches people send.

The biggest benefit of using mq was that it was relatively painless to go back and make a new patch. This may seem like a small issue, but for me knowing that I can quickly save my work for later without branching or cloning is a huge win. My biggest complaint always ends up being that I've gone and written code, committed (b/c that is the point of being distributed) and found that I'll have to jump through hoops to only push a certain set of changes that are finished. MQ has always been presented as the way to handle my use cases, but experimenting never yielded the confidence to commit to using it. At this point, I feel much better about using mq consistently and integrating it into my workflow. Also, thanks to the folks in #mercurial on freenode for all the explanations and helping me wrap my head around what is happening.

Filed under  //   mercurial   programming  

More Fun with Mercurial

The other day I had something of an issue at work. I was working on retooling our testing environment when there was a need to provide a fix for something in production. I couldn't reproduce the issue, so I decided to add some extra logging to help try and gather some data on the issue. With the code in place, it became clear that I didn't know how I was going to move those changes to the production repo while keeping my other work safe.

After looking into the issue further, I thought rebase might be helpful. Rebase is a great extension, but it wasn't going to provide a fix (that I know of). The rebase extension allows you to choose the order of two existing heads. The classic example is when you are working on a feature, you pull to get the most recent changes and you want to upgrade to the latest from the remote repo, while keeping your changes "in front" or after the pulled code in the history. My description might be a little off, but it was how I understood the process.

In my situation the scenario was as if I already rebased and did so incorrectly. Fortunately, the transplant extension came to the rescue. What I wanted to do was effectively recreate my local repo and correct the order of commits so my unfinished work was "in front of" my production fix. To put things plainly, I had a sequence of commits 'ACB' where 'C' was unfinished, so I wanted to move it to the front and have 'ABC'. What I gathered is it is not really possible to reorder the commits since the time is always attached to the changeset. But, I was able to push my production fixes without having to push my working changes, which was good enough for me.

I started by cloning the remote repo. Then I transplanted the production changesets I needed from my local repo and pushed back to the remote repo. I then transplanted the rest of my changes to the new clone. Just for good measure I pulled my new remote changes into my local repo and merged to see what would happen in terms of history. It actually made it clear that things had been transplanted at different points in time and reordered. Here is what it looked like:
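
The shuffle is easier to see with plain lists standing in for repositories. This is a toy model with the made-up changeset names from above ('A', 'B', 'C'), not Mercurial's data model: transplant here is a hypothetical function that copies the chosen changesets onto the end of another history, skipping any that are already there.

```python
def transplant(target, source, wanted):
    """Return `target` plus the `wanted` changesets, kept in `source` order."""
    result = list(target)
    for changeset in source:
        if changeset in wanted and changeset not in result:
            result.append(changeset)
    return result


base = ['base']
local = base + ['A', 'C', 'B']        # 'C' is the unfinished work

# Fresh clone of the remote, then take only the production fixes.
clone = transplant(base, local, {'A', 'B'})
remote = list(clone)                  # stands in for hg push

# Bring the unfinished work across afterwards.
clone = transplant(clone, local, {'C'})
```

The remote ends up with only the finished changesets, while the clone carries the unfinished one after them.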



> ls
local
> hg clone ssh://user@remote/hg/repo remote
> ls 
local remote
> cd remote
> hg transplant -s ../local 
... interactively choose changesets to apply ...
> hg transplant --continue # if any merges failed
> hg push


Being able to push my production changes without having to also push my working changes means the person doing the release can merge with default without having to exclude my working changesets. This doesn't seem like a huge win, but I think it is a pretty helpful way to avoid having someone work with changesets they didn't write themselves. It seems like a decent workflow as well. Keeping your own "production" or "pusher" repo as an intermediary for a remote production repo can be a helpful way of making sure you introduce atomicity while still keeping your changes in VC. I've found the more commit points you create, the easier it is to see where things might have gone wrong. The downside is that your changes might become interspersed with other changes. Rebase definitely helps in this case, and I believe using a local production repo for pushing also provides another means of keeping merges simple and obvious.

Filed under  //   mercurial  

Transplanting with Mercurial

At work we use Mercurial. I don't know that we will keep using it, as we are a rather global company and some of the other teams don't have the time to adopt a new VCS that is much more complicated than existing systems. Despite mercurial's mixed reviews among the team, I'm becoming more of a fan. I can't say I'm really a fan of mercurial per se, but it is becoming clear how a DVCS is beneficial in a more intimate way. There are the traditional arguments surrounding things like "commit on a plane" and "branching made easy" but I don't think people totally see the impact until they really have to work with a tool like mercurial for an extended period of time. It doesn't mean it's easy by any means, but after a while there are definite advantages.

One of the benefits of a DVCS is the ability to take a set of changes and place them in another branch. This is not as simple as it sounds. There is a suite of things to consider and even more potential data to keep track of. Where did the patch come from? Can you revert the changes to a different version that existed before the current version was added? If it is a set of patches or changesets, do you get to revert specific changes or is it an all or nothing kind of operation? Is there now a permanent link between the two branches/tags/heads after copying over the changes? How would that even work?!

When you start limiting things a bit, the idea becomes manageable. Mercurial has a plugin called transplant that makes some decisions. You don't necessarily get massive amounts of information which makes it relatively simple to move changesets around without much hassle. It also moves the changeset around as an atomic entity, which means that after you've transplanted, you don't need to commit or add a message saying you transplanted things. All in all, it is pretty easy once you get the hang of it.

To do a transplant first you need a repo. We are going to do everything in place, which means we are not going to clone to another directory, creating an implicit new branch.



elarson $ echo 'print "hello world!"' > hello.py
elarson $ hg add hello.py 
elarson $ hg branch 1.0
marked working directory as branch 1.0
elarson $ echo 'print "goodbye world!"' >> hello.py
elarson $ hg st
A hello.py
elarson $ hg ci -m 'ended'
elarson $ hg id
020db5c02665 (1.0) tip
elarson $ hg branch 2.0
marked working directory as branch 2.0
elarson $ echo 'print "wait... ah nvmd"' >> hello.py 
elarson $ hg ci -m 'nvmd'
elarson $ hg up 1.0
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
elarson $ echo 'print "talk to you again later"' >> hello.py
elarson $ hg st
M hello.py
elarson $ hg ci -m 'tty'
created new head
elarson $ hg id
65a5be09f306 (1.0) tip
elarson $ hg heads
changeset:   2:65a5be09f306
branch:      1.0
tag:         tip
parent:      0:020db5c02665
user:        Eric Larson <eric@ionrock.org>
date:        Mon Feb 09 21:10:41 2009 -0600
summary:     tty

changeset:   1:93127fc79160
branch:      2.0
user:        Eric Larson <eric@ionrock.org>
date:        Mon Feb 09 21:09:38 2009 -0600
summary:     nvmd

elarson $ hg up 2.0
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
elarson $ hg transplant -b 1.0 2
applying 65a5be09f306
65a5be09f306 transplanted to 9cbf35e4f623
elarson $


In the example we made two branches and both had some work in them. In the real world, this is as if you are working on a new release (2.0) and you fixed a bug in a previous release (1.0) that you need to forward port to your new release branch. If there were a series of commits you'd just do something like

hg transplant -b 1.0 3:5 7

That would transplant the changesets 3 through 5 and also changeset 7. If there are conflicts you will get to merge as usual. For example, if you use Emacs (really, what else would you use?) ediff should come up with the merge interface and you can move along.
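
To spell out how an argument list like 3:5 7 reads, here is a small sketch that expands it into individual revision numbers. The function expand_revs is hypothetical and only handles plain numbers and inclusive first:last ranges, while Mercurial itself accepts far more than this.

```python
def expand_revs(spec):
    """Expand a spec like '3:5 7' into [3, 4, 5, 7].

    Only plain revision numbers and inclusive first:last ranges are
    handled; this is an illustration, not Mercurial's own parser.
    """
    revs = []
    for part in spec.split():
        if ':' in part:
            first, last = (int(n) for n in part.split(':'))
            revs.extend(range(first, last + 1))
        else:
            revs.append(int(part))
    return revs
```

The range is inclusive at both ends, which is why 3:5 covers three changesets.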

Also, I should mention there is something to be said for being able to work in the same directory all the time. As a Python developer, I use virtualenv, but for day-to-day development, it can be much easier to just keep your system somewhat bleeding edge and only use virtualenvs for specific projects or sandboxes. It is nice to have your server running, hg up to some branch to test and see your server restart and be ready to go. It is a small issue, but once you get used to it, it is pretty convenient.

If you are using mercurial I hope you spend some time trying to learn the more detailed aspects of it. The concept of heads, while trying, is pretty helpful at times. There are also a host of plugins that can be helpful. For example, Mercurial Queues is one that consistently comes up when comparing Git and rebasing. I've found queues to be extremely confusing, but transplanting has worked for me. There are also other plugins like Local Branch that seem pretty nice. A DVCS raises the complexity bar in terms of possible work flows, and there is a pretty good chance that whatever DVCS you choose, there should be a way to make it work. For me, transplant works.

Filed under  //   mercurial   python