elarson’s posterous

 

Issues with TDD

I'm making a concerted effort to do a lot more testing at work. Up until this point, tests have always been an afterthought: a sanity check to help make sure things don't randomly break. There was always a little voice in my head nagging me that I should do a better job of testing. The goal is to create a more stable environment where we can better evaluate a release. This is an important goal because it establishes the real point of testing: providing better software.

One thing I've observed is that unless you start out doing TDD or have some sense of what the tests should look like, it can be very difficult to get tests in order. Some might argue that if the tests are difficult to write, it is a sign the code needs some refactoring. I honestly couldn't tell you if that is true, but my gut suggests that might very well be the case. On the other side of the spectrum, great tests do not by themselves make users happy through more usable applications or libraries. There has got to be a balance.

One thing I'm noticing in our own application is a set of global state that seems to always muck up the works. It can be a pain in the neck to always pass around variables instead of relying on a global. You have to consider whether that mutable variable you just gave to one object will be modified and potentially affect another object's use of the variable. One solution is to try to pass variables wherever possible, but at that point we've somewhat missed some of the advantages of objects. With that in mind, the globals we do have seem relatively benign, as they typically wrap some storage or persistence piece that doesn't logically make sense to always pass around. This doesn't even begin to answer the questions of handling thread safety.
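To make the trade-off concrete, here is a minimal sketch of one middle ground: keep the global as a default, but let a test hand in its own instance. The names (`Storage`, `Inventory`) are hypothetical and not from our application, and the sketch ignores thread safety entirely.

```python
class Storage:
    """Stand-in for a persistence layer; here it just wraps a dict."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value


# The "benign global": one shared storage object for the whole process.
_default_storage = Storage()


class Inventory:
    """Uses the global by default, but accepts a replacement for tests."""

    def __init__(self, storage=None):
        self.storage = storage if storage is not None else _default_storage

    def add(self, item, count=1):
        current = self.storage.get(item, 0)
        self.storage.set(item, current + count)
        return current + count


def test_add_is_isolated():
    # Each test builds a fresh Storage, so state cannot leak between tests.
    inventory = Inventory(storage=Storage())
    assert inventory.add("widget") == 1
    assert inventory.add("widget", 2) == 3
```

Normal code keeps calling `Inventory()` and never thinks about storage, while the test avoids the shared state. Whether this is worth the extra constructor argument is a judgment call.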

When I start thinking about all these things it really makes me frustrated. It is definitely a morale killer because improving the code slows momentum to a crawl. You want to refactor things to get something more testable and reliable, yet in doing so, you can easily create regressions by changing the previously working tests. Along the same lines, refactoring tests seems like a recipe for disaster, as you are changing the one bit of code that stands as a benchmark for functionality and quality. Today was particularly frustrating because it felt like every time I started cleaning up some piece of code, it turned out to be intertwined with the code I was trying to isolate. At this point it seems clear that there is a real need, which is a good thing. Hopefully tomorrow things will be a bit clearer on how to start untangling things in a way that actually helps improve the reliability of the code and makes for a better user experience.

Filed under  //   programming   python  

Introducing Focusr

One aspect of time management that is critical to success is finding a way to focus on tasks. For many people, myself included, it is a pretty serious battle that takes tons of practice and creative techniques for fooling yourself into sticking to the task at hand. One such technique is the Pomodoro Technique. I haven't read the book, nor would I consider myself an expert by any stretch, but the basic idea seems simple enough to run with despite the lack of formal training.

In a nutshell, you give yourself 25 minutes to complete a task and then take a short 5 minute break before moving on to the next task. From what I understand, the book emphasizes using a visible egg timer to make the whole process convenient. Seeing as I'm a programmer and there are a multitude of ways built into my desktop to get my attention, it seemed like a good opportunity to create a simple tool.

The result is Focusr. This is a really simple timer that helps you complete Pomodoro-like cycles. You say you want to start a task, it starts the 25 minute timer, lets you know when the time's up and does the same for the break. Rinse and repeat. It is super simple and surprisingly effective.

You can grab it from the web or install it with easy_install or pip. It uses libnotify's notify-send command to do the actual notification. Also, I created a simple Emacs function so I could start it easily.

 
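(defun pomo ()
  "Start a pomodoro task: 25 minutes working and 5 off."
  (interactive)
  ;; Bind locally with `let*' so msg and cmd don't leak as global variables.
  (let* ((msg (read-string "What do you want to work on? "))
         (cmd (concat "focusr " msg)))
    ;; Start (or reuse) a bash process buffer and send it the focusr command.
    (comint-simple-send (make-comint "pomodoro-task" "bash") cmd)))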
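
For a rough picture of the cycle described above, here is a minimal Python sketch. It is not Focusr's actual source: `run_cycle`, `notify` and their parameters are names made up for illustration. The only external piece is the `notify-send` command, called with a summary and a body, and the sketch assumes it is installed.

```python
import subprocess
import time


def notify(summary, body=""):
    """Show a desktop notification via libnotify's notify-send command."""
    # Raises FileNotFoundError if notify-send is not installed.
    subprocess.run(["notify-send", summary, body], check=False)


def run_cycle(task, work_minutes=25, break_minutes=5,
              notify=notify, sleep=time.sleep):
    """Run one work period followed by one break, notifying at each step."""
    notify("Start: " + task, "%d minutes of focus" % work_minutes)
    sleep(work_minutes * 60)
    notify("Time's up: " + task, "Take a %d minute break" % break_minutes)
    sleep(break_minutes * 60)
    notify("Break over", "Ready for the next task")
```

Calling `run_cycle("write tests")` blocks for the full 30 minutes. The `notify` and `sleep` arguments are injectable mostly so the function can be exercised without waiting or popping up notifications.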

While I'm sure buying the book could be helpful, it seems more helpful to understand what Pomodoro is actually doing. For me, it presents an attainable period of time to focus on a task. I've read over and over again that one key to better productivity is breaking large tasks into smaller tasks. This is easier said than done though. By taking on the day in 25 minute chunks you're forced to consider how you can break up tasks such that you finish a task within the time limit. In addition to getting better practice breaking up tasks, you are also exercising your estimation skills and getting a better understanding of how much work you can really do. Like I said before, the concepts are really simple with or without formal training.

For myself, I also appreciate the obvious openness of the system. Becoming more productive is partly effectively utilizing systems while always evolving your techniques. As a person you have an innate ability to hack around your own efforts. I think this technique is simple enough that it can be used many different ways to help keep your mind guessing, which in turn helps to truly learn how to get more focus.

Filed under  //   productivity   programming   python  

One Problem with Music Piracy

I read this article by Lily Allen on music piracy. While I'd like to think that music should be free, after looking at a few offers we've received from labels it is difficult to deny that music piracy has been considered part of the equation.

The big issue is not that record labels are struggling. This is a good thing. Labels have been loan sharks for entirely too long, so good riddance. But, as crappy as big labels can be, there is a wealth of smaller labels that do aim to release great music to people because they appreciate what artists are doing. These range from people with friends in bands they want to help out, all the way up to labels who have established themselves as viable businesses. For bands, oftentimes a label is the sole party that really can break a band.

There are plenty of other ways a band can break, but the most common is through a label. The label puts out a record, pays for press, helps improve the band's stature and spreads the word, with the result being the band finds a fanbase. The most important thing in this whole scenario for a band is finding the fans. While the critics can be nice to sway and garnering massive attention on blogs has its perks, the real deal is to get your music to fans. As a band, the closer you get to your fans and the more direct a relationship you build, the better the payout.

In the past bands could make money selling records. It was a big lottery, but it was still possible. Bands also were given the responsibility to do with their money as they chose. A band gets a big advance from the label and uses the funds to build their business (or buy drugs). Now, labels don't make nearly as much money selling music, so they are looking for other revenue streams. This means "partnering" up with an artist. The negative term has become the 360 deal where a label effectively can act as a manager, taking a percentage of the gross income. The more common occurrence from what I've seen is that labels aim to find targeted income areas to cover costs through the artist.

This in and of itself isn't too bad. It can be a huge help keeping the band funds liquid. If a band lands a huge tour or gets massive success, every t-shirt size they don't have is money lost. If a label can make sure you have enough shirts, hoodies and vinyl, that is a definite benefit. Labels can also help find placements and synchs for songs, which can be very lucrative. Again, very nice.

The problem is that while these benefits can be helpful, they can also preclude you from building your own business. If your label is your merchandising outlet, you probably can't make t-shirts yourself. If you have a three record deal, that is probably 3-5 years lost building part of your business. You won't be able to build up your own relationships with another manufacturer or put together your own fulfillment facilities. Likewise, if your label owns some of your publishing, that can end up being a lot of money. Say you have one song that gets 5 placements a year. If you average $2k per placement that is $10k. Over 20 years, that is $200k, which seems pretty good. Take 25% and give it to the label, pay another 20% in taxes and split it up between band members and it quickly becomes rather depressing. Couple that with the fact some labels are asking for actual percentages of tour revenue as well. Also realize that the band is most likely having to pay a lawyer, manager and booking agent (and in some cases a tour manager). Many of these support staff used to take flat rates, but now all focus on percentages.
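The publishing arithmetic above can be spelled out in a few lines of Python. This is only a back-of-the-envelope sketch: the four band members and the choice to apply the 20% tax after the label's cut are my assumptions for illustration, and `publishing_share` is a made-up name.

```python
def publishing_share(placements_per_year=5, fee_per_placement=2000,
                     years=20, label_pct=25, tax_pct=20, members=4):
    """Return (gross, after_label, after_tax, per_member, per_member_per_year).

    All amounts are whole dollars; percentages are applied one after another.
    """
    gross = placements_per_year * fee_per_placement * years
    after_label = gross * (100 - label_pct) // 100
    after_tax = after_label * (100 - tax_pct) // 100
    per_member = after_tax // members
    per_member_per_year = per_member // years
    return gross, after_label, after_tax, per_member, per_member_per_year


print(publishing_share())
# (200000, 150000, 120000, 30000, 1500)
```

Under those assumptions, the $200k over 20 years works out to about $1,500 per member per year.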

Before, if a band got 12% of a record sale they were doing pretty well. They could go on the road and make good money. Now, you don't get any money from record sales even though your percentage might be 50%. In addition to that, you are effectively taxed by your "team" on everything else you do. 5% for your lawyer, 10-15% for your manager, 10-15% for your booking agent, 10-20% for a tour manager, and 10-15% for your label. So, between 45% and 70% of the money you make disappears before you see it. Even if you do "break" and get a decent following, getting $5k a show doesn't mean much over the course of a year. In the famous words of Bill Cosby, "You have eaten yet!" Even if you take home half of the $5k for 30 dates ($2500 * 30 = $75k) you've yet to pay for gas, food, lodging, equipment and taxes. When you consider this money has to cover all the band members' living expenses both on the road and off, things start to get rather tight. This doesn't even consider off-road maintenance such as wardrobes and props (fog machines and can lighting are not cheap).
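Here is the same "team tax" math as a small Python sketch, using only the percentages quoted above. It assumes, as the paragraph does, that every percentage comes off the same gross figure; real contracts may stack differently. `TEAM_CUTS`, `total_cut_range` and `tour_take_home` are illustrative names.

```python
# (low, high) percentage each party takes, as quoted in the post.
TEAM_CUTS = {
    "lawyer": (5, 5),
    "manager": (10, 15),
    "booking agent": (10, 15),
    "tour manager": (10, 20),
    "label": (10, 15),
}


def total_cut_range(cuts=TEAM_CUTS):
    """Sum the low ends and the high ends of every party's percentage."""
    low = sum(lo for lo, hi in cuts.values())
    high = sum(hi for lo, hi in cuts.values())
    return low, high


def tour_take_home(fee_per_show=5000, shows=30, cut_pct=50):
    """Money left for the band after the team's cut, before any expenses."""
    return fee_per_show * shows * (100 - cut_pct) // 100


print(total_cut_range())   # (45, 70)
print(tour_take_home())    # 75000
```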

I honestly have to agree with Lily that music piracy doesn't really help the artists. It does help destroy the old system, which I do think is a good thing. But, unless there are some provable models for being an artist that don't involve taxes by everyone that helps you break, it doesn't seem like it is any better for artists. In a sense it is potentially worse because at least before, labels sold a product. The initial investment or R&D was directly tied to selling music. Now, everyone wants to get shares in the artist's career.

The music business has been compared to the tech industry and venture capital, but I don't think that comparison is quite right. Music doesn't scale the same way technology does. Craigslist is a great example where a really small group of people have been able to scale a small business from one city to a whole country. A band simply can't all of a sudden put out an album a day while touring non-stop. They need others to help keep the message out there and no one works for free. The result is that in order to break and make enough money to be reasonably stable, the actual revenue needs to be pretty huge.

With all this said, the fact still remains that I love playing music. It is an honor that people want to invest money in the music that I help create. While we plan on trying to be as shrewd as possible in terms of music as a business, the performers and artists inside just want to get our music out to as many people as possible. It is really all about fans and making that connection, so if that means giving up some ownership in order to help get our name out there, I'm kind of OK with that. The only requirement is that if you want a share of our music, you need to really earn it and be a fan.

Filed under  //   music  

Experiments with Diesel: Repeater

It feels as though web development has begun to focus on other parts of the stack. Up until recently, the framework decisions seem to be the biggest focus, with MVC based patterns reigning in the masses. At this point though, there is a wealth of documentation and options that make trying out the latest and greatest MVC framework slightly passe. To combat this stagnation in hip web technology, the focus has changed slightly to server technology.

Here are a few recent articles that make the point regarding the state of web development:


While most of these ideas and concepts are not new in the realm of computer science, they are new to me. I represent a rather large audience of web developers who did not necessarily see the web develop from socket libraries to Rails. Instead, my experience began with PHP and didn't include any understanding of what actually happens on the web. Fortunately though, my own forays into web development exposed some consistent patterns that helped me understand what was really happening. So, with these new-ish ideas coming around, it seemed like a great learning opportunity to become better acquainted with the lower-level aspects of web development.

While I have a few projects lying around my home directory for each of the ideas mentioned above, this one is specific to Diesel. I'm somewhat partial to it for social reasons. When thinking of an idea for something that might very well benefit from being asynchronous (that is not a chat server), I came up with the idea for a database hub.

The idea for Repeater is that you set up a proxy that will "repeat" the requests to a set of services. For example, if you were using CouchDB, you could set up Repeater, which would be a round robin proxy for the different instances and allow updating all the instances with one request. This doesn't work yet! But that is the idea.

What does work is a basic proxy that will balance between some set of services. It works pretty well from the standpoint of basic tests not failing, but I have no idea if it would be extremely slow in practice. The thing that got me stuck is how to handle more requests without blocking while making the request to the other site. From the code, you can see where I started messing around with threads to make sure I don't block. The gotcha in all this was that I couldn't figure out a way to effectively yield a sleep or a join, where the join happens once the thread has finished making the request. I'm sure there is a way to do it, but my experiments were not very fruitful.
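The routing idea on its own is small enough to sketch in plain Python. This is not the Repeater code and uses no Diesel APIs; it leaves out all networking and the blocking problem described above. The class name, the `targets` method and the rule "reads rotate, everything else goes to every instance" are assumptions made for illustration.

```python
import itertools

# HTTP methods treated as reads; anything else is treated as an update.
READ_METHODS = {"GET", "HEAD"}


class Repeater:
    """Pick which backends a request should be sent to."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self.backends = list(backends)
        # cycle() hands out the backends in order, forever (round robin).
        self._rotation = itertools.cycle(self.backends)

    def targets(self, method):
        """Return the list of backends for a request with this HTTP method."""
        if method.upper() in READ_METHODS:
            return [next(self._rotation)]
        # Updates are "repeated" to every instance.
        return list(self.backends)
```

With two CouchDB instances, successive `targets("GET")` calls alternate between them, while `targets("PUT")` returns both. Actually sending the requests without blocking is the hard part and is deliberately not shown.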

My overall impressions are pretty good. I'm still a bit hazy on the applications where an async system is an order of magnitude better than a threaded or forking (or both) system. The chat example seems obvious because you have a direct relationship between a client connection and a single process that needs to make responses in order. This is similar to a database connection in that it may need to handle a large number of connections, but at the same time, as soon as you start doing things like requests over the web, it seems inevitable that blocking will be an issue. Still, it seems very solvable.

Filed under  //   programming   python  

Noticing Small Features

We recently returned from a tour. While on the road, I'm still working. I have a slick bean bag chair, curtains and MiFi router, which makes it possible to get my work done while cruising down the freeway. This is something I've discussed before.

My two recent annoyances with my setup are not having an automatic way of completing an email address and slow updates to the server. So far, the slow updates will probably involve setting up a local IMAP server to sync to and from. It is something that I've started looking into, but haven't spent that much time on it. It ends up being easy enough to just ignore email and save the time. What is rather annoying is having to enter email addresses.

It is not that bad, but it is relatively time consuming. My typing could be much better and it can be easy to make mistakes. Also, there are many times I'm copying the email address from some other resource such as a web page or IRC, in which case, there is a lot of buffer management to deal with. Again, not a huge deal but enough to warrant looking into solutions.

Fortunately, like most things in the Emacs landscape, there was already a good solution in place: the BBDB, or the Insidious Big Brother Database. I initially tried using the tips mentioned on emacs-fu but found they didn't work. Fortunately, the solution turned out to be incredibly convenient and sitting right in the Wanderlust docs. The result is that I'm now collecting email addresses for completion, much like I would get in Gmail, in addition to having an actual address book I might consider using. Pretty nice ROI for googling a bit.

This realization that you need a feature is a pretty common occurrence in Linux. The conceptual basis is usually there to create a solution, but oftentimes it takes a bit of work to really get things configured. Emacs is very similar, with the exception being it is almost expected that the configuration might very well mean writing a new mode or piece of functionality yourself. It serves as a good reminder that I shouldn't go recommending Emacs to my parents anytime soon. Although it would be pretty cool if I did get a call from my dad asking about setting up a keybinding for some lisp function he wrote.

Filed under  //   emacs   programming  

Parallelism Relevance

The other day I got rather caught up in the whole * is Unix discussion and wanted to get a better understanding of some of the more low level details associated with writing servers. Eventually, I had a crude version of CherryPy that forked instead of using threads. The benchmark in CherryPy seemed to suggest it was a great deal faster, but honestly I think that was a fluke, especially considering I'm pretty sure there are some serious deficiencies in terms of process management. When Bob posted his forking example on the CherryPy list and mentioned signals, it was clear that I was missing some pretty important understanding.

One thing I thought seemed helpful was that it seemed to use both processors on my system. My understanding is that calling fork creates a separate child process alongside the parent, and the operating system is then free to schedule the forked processes on different processors. As far as I know, this is also how the multiprocessing module works (on Linux at least). With this new cursory knowledge under my belt it was interesting to see an article in my ACM Communications magazine remark on the future of computing and the need to handle multiple processors in a similar manner as sequential paradigms. I should note that this is one of the very few times there was anything remotely interesting to me in my ACM magazine, so I felt compelled to check it out. The idea is not new to me, and while the argument that this is a huge issue is pretty valid, I also started thinking that maybe it is not as big a deal in practice.
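To see the two processes for yourself, here is a minimal Unix-only sketch using `os.fork`. It has nothing to do with the CherryPy experiment; `fork_and_report` is a made-up helper that forks once and has the child report its process ID back over a pipe. Which processor each process runs on is up to the operating system's scheduler, so the sketch makes no claim about that.

```python
import os


def fork_and_report():
    """Fork once and return (parent_pid, child_pid)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: report our own PID to the parent, then exit at once.
        os.close(read_fd)
        os.write(write_fd, str(os.getpid()).encode())
        os.close(write_fd)
        os._exit(0)
    # Parent process: read what the child wrote, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        child_pid = int(reader.read())
    os.waitpid(pid, 0)
    return os.getpid(), child_pid
```

The two PIDs returned are always different, which is the whole point: after the fork there are two independent processes, each with its own copy of the program's memory.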

The reason for this is the web. Back in the early days of computing people had dumb terminals and logged into a server where all the actual "computing" was performed. While thick clients have been the rage for quite a while, the web is effectively becoming the mainframe in the sky in terms of where people are doing their computing. This should be very helpful in making the jump to parallel computing because there is a definite history and set of tools that have been developed to scale across parallel processors. In other words, who really cares if your computer has 2 quad core processors when you're searching email online, browsing Facebook, checking your bank statements and reading the news through your web browser?

On the web, what really matters is that the sites doing the processing have optimized their architectures to handle the load. The other side of the equation is the browser, but in reality, this is an area that is already becoming more robust. Google Chrome is a good example with its separate processes for each tab, but I would even argue that Javascript is well suited to handle distributed tasks. We've seen a ton of articles on using async servers. Javascript is already using an async model that is only getting faster with the recent developments in new Javascript engines. None of this means that parallel computing and utilization of more processors is unimportant. I just don't think it is quite as critical as some might think. That doesn't mean programmers shouldn't try to understand it. After all, it is a hard problem and programmers typically like hard problems. My whole point though, is that the problem might be more of a fad than an actual crux for the IT industry.

Filed under  //   programming   python  

The Perfect ORM

Some folks are tired of ORMs and raise a good point regarding exposing SQL generation in ORM libs. Most of the time when I use an ORM, there is a moment where I wish I could just get on with things. The declarative approach for creating tables usually is one hurdle that annoys me. Another is defining relations, since it seems so trivial to do that sort of thing in the query. At the same time, I'm far from a heavy database user and can imagine my "it's so simple" mindset quickly deteriorating into massive complexity without much work.

Though, I always did like the web.py db layer. It pretty much gave you queries and returned the results in objects. Pretty simple stuff. I think it also let you update those objects and that would be persisted in the DB, but I could be wrong about that. At one time I tried to use it on its own but didn't have much luck. One gotcha is that the db layer depended on running within web.py and had expectations for keeping a threadpool and the like. Thinking back, it could have been very helpful if web.py's db layer did provide some helpers for writing SQL.

Based on the article and the comments, it sounds like SQLAlchemy is already able to expose the query mechanism. I'm not sure how this works within the scope of providing model objects, but it sounds pretty promising nonetheless. Likewise, Dejavu, a personal favorite, excels at improving queries via translation from language constructs. The NoSQL movement is partially exciting because instead of thinking about relational algebra you think about looping and applying functions. Dejavu uses the same ideas and exposes the Python to SQL translation with its Geniusql lib.

Even though I don't really use databases very often, it is interesting to see how people are handling the progressive issues of using an ORM in the real world. Just as databases aren't dead, I believe ORMs are still very important and will most likely become even more important. There is a ton of well tested boilerplate code in ORMs that makes creating models much easier. My guess is that even in a world full of NoSQL, we'll still see ORMs help out when it comes to keeping tabs on domain objects. What we won't see is one ORM that supports every database or persistence layer known to man, although there will be people who try to write it.
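
As a rough illustration of the "queries in, objects out" style I liked, here is a minimal sketch on top of the standard library's `sqlite3` module. It is not web.py's db API (nor SQLAlchemy's or Dejavu's); the `query` helper and the example table are made up for illustration, and it only handles statements that return rows.

```python
import sqlite3
from types import SimpleNamespace


def query(conn, sql, params=()):
    """Run hand-written SQL and return each row as a simple object."""
    cursor = conn.execute(sql, params)
    # cursor.description holds one tuple per column; item 0 is the name.
    columns = [col[0] for col in cursor.description]
    return [SimpleNamespace(**dict(zip(columns, row))) for row in cursor]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE band (name TEXT, members INTEGER)")
conn.execute("INSERT INTO band VALUES (?, ?)", ("Example Band", 4))

rows = query(conn, "SELECT name, members FROM band WHERE members > ?", (3,))
print(rows[0].name, rows[0].members)
```

You still write the SQL yourself, but the results come back with attribute access instead of positional tuples. Everything an ORM adds beyond that (relations, persistence of changes, schema declarations) is exactly the part that gets complicated.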


Just Checking In

Today I had the desire to write something down, but really didn't have a concise idea of what to write about. So this post is just going to be a small summary of some thoughts and experiences.

Free Software and Open Source

I recently read the RMS opinion on Codeplex and Miguel's response. After a quick glance over at planet gnome I noticed a few people taking sides and it occurred to me that the whole argument is rather silly. When I was in college the concept of free software made a ton of sense. Looking back it was because I didn't have any money, so generally anything free made a ton of sense. Now that I'm a full fledged tax paying adult, the glamour of free software has lost its glitz. It is not that free software has become unimportant or useless. What has happened, in my mind at least, is the arguments associated with free software have become rather stale. By stale, I simply mean it isn't anything exciting for me personally. I think free software is critical, but I have better things to do than care about it in its own right. I'm probably just getting old, but it was an interesting realization for me nonetheless.

Test Driven Development

At work I've been trying to improve my tests. By "improve" I really mean write them in earnest. It is a really difficult thing to write code using TDD. It is a similar approach to modeling in that it forces you to consider an abstract idea of what some code should do and look like. TDD is sort of like UML in the age of Ruby on Rails, which is kind of funny as the recent web frameworks and NoSQL all suggest rapid prototyping over planning before coding. While both UML and TDD are doing pretty much the same thing in terms of hashing out code, the obvious benefit of TDD is that you get something that can be used in the future. At the same time, a well tested code base is not that important if the tests are bad and are hard to run. Testing in web browsers is the most obvious case in point. The larger point then is obviously that planning, whether through tests, Visio or some hodge podge of tools, is helpful for writing better code. It might also be argued that it is faster since the design is fleshed out to some extent, but I would ask if the time spent planning is included in that calculation and if it is a real calculation at all. Programmers have a nasty habit of estimating because of the constant requirement to create hypotheses in debugging. My bet is that many of the virtues of TDD (like UML as well) are overblown and the only real benefit is forcing a developer to focus on what the problem is. One of my issues is that it creates a whole new class of code that deals with testing. This is totally fine, but where are the tests for the tests? It seems like a story that we'll probably never see the end of.

Text Browsing

I'm going to suggest that if you're a programmer, it would treat you well to try out a good text web browser. My recommendation is w3m due to its Emacs integration, but anything that can keep you in your work environment works. My guess is vimmers would get similar usage from links/lynx, assuming the terminal is their environment. The reason is that if you are constantly editing text and reading it in your dev environment, browsing the web textually can be a helpful tool to keep focus. For me, I get the same keybindings, easy copy and pasting, and simpler window/frame/buffer management. Beyond this though, it feels faster when it comes to reading documentation and finding helpful code. Your mileage may vary, but it sure couldn't hurt to try.

Administering Systems

At work we recently rolled out a new system. It isn't actually new, but is in fact the latest step in an improvement to a current system. What always strikes me about the smart folks I work with is how gracefully they walk the line between system administrator and programmer. The two fields are completely intertwined, but the best programmers are those that have the better understanding of both sides. This is probably partly why I'm not that great of a programmer! For whatever reason, my mind doesn't ever seem to really indulge in the system administration side of things. It is always a challenge for me to make pre-existing software work the way I think it should. That doesn't mean I'm not trying of course! But it does mean that I have a ton to learn and will for the foreseeable future.

ACL Wrap Up

This past weekend was ACL in Austin and it was a blast. We saw Them Crooked Vultures, The Walkmen, School of Seven Bells, Broken Social Scene and some guys from Phoenix DJ. We also played a show with The Riverboat Gamblers and The Soldier Thread. It was a ton of fun. On the road we don't really get to hang out that much. There is usually somewhere to drive, something to load or unload or something to sell that usually keeps us busy. It was great to come home, rest and then have a great weekend of music and friends. We didn't go to ACL proper and I'm glad. There were plenty of bands I would have liked to catch but the weather was horrid and my guess is I would have been pretty miserable in the mud. Hopefully next year there will be some nicer weather. Who knows, maybe we'll even get to play!

Filed under  //   emacs   music   programming  

ACL Thursday

   

We got into Them Crooked Vultures for free.


Finally Using Emacs for Web Browsing and Email

It has taken a long time, but I've finally managed to get email working relatively well in Emacs! In the end Wanderlust was the client of choice for me. It has a pretty simple file format for organizing mail and the keybindings have already begun to feel rather natural. The reasoning behind my quest for merging my email and Emacs had more to do with my recent tour than an actual desire or need for checking mail in my text editor. I was perfectly happy with Gmail and in fact I'm technically still using it through IMAP. The real reason was that I wanted to try and reduce my bandwidth usage. On tour I had a 5 Gig cap on my network usage. Having never really monitored this sort of thing before, I had no idea how much bandwidth things like IRC might use when you're talking about every day, all day usage. To make things worse, I forgot to grab the USB cable that would let me check my current usage for the month, so I was essentially flying blind!

My fear of an enormous bill made me consider what I could do to reduce my bandwidth. The obvious avoidance of listening to music and watching videos was easy. Also, reading local versions of docs was another helpful tactic. This led me to see a few things about w3m, which I had tried to use before in Emacs. It dawned on me that viewing things like code examples in a text based web browser makes nothing but sense in my text editor. It makes things like copying and pasting code samples a breeze. After getting used to w3m in Emacs it became clear that I should really try to finally tackle email. As I said, Gmail has been just fine, so the motivation to move wasn't very strong. The killer feature in my mind was tying things into my todo list and avoiding Gmail's constant stream of Ajax requests. I'm sure the bandwidth savings are practically nothing, but nonetheless, it seemed like if I got things working it might really be helpful in ways similar to w3m.

Once things were configured and I managed to get everything working, I did notice some helpful bits. Silly annoyances that really should never have been a problem were smoothed out. I'm talking about basic copy and paste issues I noticed here and there when working with Emacs and the rest of my desktop. Little things like working with Trac were also a good deal faster, although it was something of a let down that I couldn't submit updates to tickets. Finally, it seems more reasonable to keep up with mailing lists as most of the necessary content (code) is close at hand. This last aspect seems to be the biggest draw of using Emacs. I primarily write code and keeping more tools inside Emacs makes it faster to transition between those tools and the code I'm working on. Of course it does seem kind of cool in a geeky way to work almost solely in text, but the main nicety is actually being more productive. I'm hoping in the next couple of weeks things can continue to become a little more streamlined. The killer app that I would like to find a good Emacs replacement for is my feed reader. Google Reader is great but something more bare might help to move through things faster and generally filter out the cruft. That said, blogs were one of the first things I cut out in my quest to lower bandwidth. My desire to check what is new in the blogosphere has greatly diminished since taking a break. I'd like to potentially keep it that way for a while.

Filed under  //   emacs   programming