Sebastian Slutzky's blog

Monday, March 14, 2011

Notes from QCon London 2011

Notes from QCon London 2011

Did you know Facebook is n ordinary single-threaded PHP website? How complex do you think Twitter’s business logic model is? Do you know how Nokia is preparing to scale Ovi once (if) their Microsoft deal is agreed?

You may already be familiar with InfoQ, but if you are not, I recommend you to take a look at it at www.infoq.com. InfoQ is an online community focused on innovation in enterprise software development. On a daily basis, they deliver high quality articles for IT communities such as .Net, Java, SOA, Ruby and Agile among others.

QCon is the annual event organised by InfoQ, where experts in each of these diverse areas provide an update on the latest lessons learned, discoveries and challenges ahead. I am just back from QCon 2011 in London, where I had the opportunity to listen to some of these experiences and to get an idea of what is being done in other companies and countries.

Is not often that you get to talk to the people behind architectures such as Twitter, Facebook, Nokia and Guardian.co.uk among others, so I want to share some of the good stuff of QCon here.

Today I’m starting with an overview of the chats I assisted to on the first of the three days of conference . I can’t guarantee I’ll remember (or I even grasped) all I listened to, but I can at least give you resources and point you to the right documentation. I also plan to write detailed posts on a few topics I found most interesting.

Day 1

All presentations were held during three days, and categorised into five areas or tracks per day. Unfortunately there is only one me so I had to select carefully …

One of the tracks of Day 1 was “Enterprise Agile Transformation”. Being an Agile geek, I had to force myself to diversify, which was as difficult as getting a dog to eat a balanced diet.

Scaling Lean & Agile: Large, Multisite or Offshore Delivery

Craig Larman has a vast experience on implementing SCRUM in very large-scale project. With large I mean 500-1500 person, multisite teams with clients such as Xeror & Alcatel-Lucent. From these experiences he shared lessons learned, patterns and anti patterns on how to adopt and maintain Agile projects with multi hundred person teams, both co located and off shore.

Larman is also the author of the books “Thinking & Organizational Tools” and “Practices for Scaling Lean & Agile Development”

This was the key note of the day what means that a bit of time is wasted on bad jokes, but it gave us an idea on how we can keep a high level of communication and transparency despite teams being very large or non- co-located.

The Invisible Computer Lab

I am still not sure why I choose this presentation, but it turned out very interesting. Fraser Speirs is an iPhone developer who is also the head of Computer at a School in Greenok, Scotland . He told us how his school provided all children with iPads as the main learning channel. He also shared with us what where the students, teachers and parents reactions, his view of the overall results of the iPad implementation, and how he foresees the future of technology in the classroom.

Contracts & Collaboration in Agile “Offshore” Outsourced Development

Again Craig Larman (I told you I struggled to diversify this day!) deep-dive on how to scale Agile, in particular with off shore outsourced teams. There were a few useful tips and tricks on how to deal with offsite teams without losing the high level of transparency required by Agile methodologies. From cultural differences to contract-model choices, this was no high level, abstract Agile talk, but a very hand on and pragmatic one instead, including tips as low levels as what video chat software to use in daily SCRUMs .

"Bringing developers and testers closer together with Visual Studio"

The “Solutions” track, the only one that run over the three days, was the showroom for commercial products and frameworks. In this context, Gile Davies from Microsoft UK presented Test Manager, a new application for collaboration between testers and developers.

I have to say I was extremely impressed by this tool, which is an extension of Team Foundation Server. The amount of help it provides to testers and developers during the lifecycle of bugs is what surprised me:

- Testers can record their UI interaction within Windows and store them (okay this is not all that impressive yet…)

- They can modularised these UI steps into logical blocks (e.g. “Find a book to buy”, “Add it to the cart with IE 7.0”, ”Cancel order using keyword only”, etc)

- All test steps gets auto appended to Bug description.

- Developers can click on the logical block that failed. They can even debug the problem as of the execution performed by the tester (yes!, it saves the entire state of the test that failed, so developers and tester are always talking of the same environment!)

- It can even take automatic snapshot of cloud environments (we are talking Windows Azure only, of course)

Complex Event Processing: DSL for High Frequency Trading"

Richard Tibbets presented the system his company build, StreamBase. It is an engine for Complex Event Processing Domain Specific Languages, complied into JVM code. The common perception of DSLs is that they are very expressive but that they cannot perform in scenarios where speed is at premium, like front-office trading applications.

Unfortunately, I wasn’t able to dissipate my objections to DSLs. Ironically, despite all the talk about speed I felt this chat so slow it was never ending.

"Where Did My Architecture Go? Preserving software architecture in its implementation"

Agile systems are evolutionary, which means that the design of the system is not fully detailed up front. Instead, it “emerges” from a series of cycles of iterative development. Now for those who are not used to evolutionary design, this may sound like anarchy or worse, like no design at all.

Eoin Woods did a very convincing walk through a set of tools that allow system to generate their representation from their implementation. This allows teams to keep their architectural design updated, so that the big picture is always clear as the software evolves. Some of the tools I knew already, but there were a few others that I’d really like to try in my next project. The tools demonstrated included support for .Net, Java and C++ technologies.

More to follow…

As I mentioned earlier, I will follow this post with other two with the presentation I’ve attended to on Day 2 and 3. After that, I will write a detailed post on some of the talks I’ve found most interesting.

And remember; just shout if you want more info on any of the chats I just mentioned. My brain is entering entropy soon so I don’t guarantee to keep this stuff around my head for a long time.

Monday, April 19, 2010

Finding duplicated files recursively

I was cleaning up a folder with files duplicated in subdirectories. Here is a simple Powershell code that identifies duplications recursively.

There might be a shorter form, but I think this is as I short I can get without becoming very cryptic.

function FindDuplicates
{
  #Receives a list of ArrayLists
  # iteratively prints the full path of each ArrayList item
  # where the ArrayList contains multiple paths (i.e duplicates)
  Process
  {if ($_.Count -gt 1)
    {foreach($d in $_){$d.FullName}}
  }
}

function GetFilesInBuckets($dir,$filter = "*.*")
{
   # Creates a hashtable of arraylists (1 entry per unique file name)
   # Returns the list of arraylists only (not the entire hashtable)
   $files = @{}
   foreach ($fl in (gci $dir -filter $filter -r))
  {
        if(!$files.ContainsKey($fl.Name))
         {$files[$fl.Name] = New-Object System.Collections.ArrayList}
        [void]$files[$fl.Name].Add($fl)
  }
  $files.Values
}

GetFilesInBuckets "C:\MyRootFolder" -f "*.txt"|FindDuplicates

Additional filtering methods will look similar to as "FindDuplicates", and can be piped at the end of the last line of code:

GetFilesInBuckets "C:\MyRootFolder" -f "*.txt"|FindDuplicates|FilterOutOlder

Friday, April 16, 2010

NAnt and the limitation of XML programming

I have used NAnt (the .Net port of the Java build tool Ant) in most of the .Net/VB6 projects I have worked on. The decision for using this tool was not always mine, but I guess I did not find (nor look for) a better alternative.

As these projects grew, so did their build process and the complexity the build scripts. NAnt was flexible enough to cope with this. If a suitable task does not exist, you can download NAntContrib tasks, other tasks from the web, or you can create yours. If a task is too simple a language construct for the job, you can call an executable and pass the right parameter, create a .Net in-line function, or call a Powershell script.

One of the systems I worked on was particularly complex but very well designed. So was its build script. Common functionality was not duplicated, but kept as targets or tasks in separate NAnt files, that where included in the main NAnt script.

When reusing these blocks of code, you usually have to pass context-specific parameters to it. You do this by setting properties before invoking the NAnt target. For example:

...
<property name="SpreadsheetPath" value="${Config} + ${CurrentProject}" />
<call target="ProcessSpreadsheet" />
...

This syntax becomes more awkward when more properties need to be passed to the target.

Worse, the lack of explicit input and outputs to a NAnt target make this syntax a bad practice: code calling a target needs to know the name of the properties the target needs set, and the name of the property containing the expected output of the target. Let me picture this with an example:

<!—Known inputs: SpreadsheetPath and BusinessArea-->
<property name="SpreadsheetPath" value="${Config} + ${CurrentProject}" />
<property name="BusinessArea" value “Security" />
<call target="GenerateCodeFromSpreadsheet" />
<!—Known output: NumberOfGeneratedFiles -->
<property name="TotalCodeFiles" value="${NumberOfGeneratedFiles}" />

The problem: When XML is not enough

I don’t have the intention (nor the knowledge!) to enter the XML vs DSL discussion but I see a real disadvantage in using Ant/NAnt for big and complex systems. Generally speaking is not a good idea to program in an XML based DSL (like NAnt). It makes sense to declare targets that know what to do and how to do it, but as the code base grows and scripts are modularised for better reusability, then is only a matter of time until you find yourself programming on XML.

The XML/DSL discussion is a very interesting one, and so is the classification of programming paradigms ( what is NAnt programming. Goal-Oriented? Declarative? Or simply a markup language?), but I’m not interested on them now. I think I’m done with Ant/NAnt…I’ll stop trying to extract oil from stones, as Spaniards say. There’s only so much you can ask xml-based languages, but soon you’ll need a proper programming language, or you’ll become an xmlholic.

If you know Maven for Java solutions (NMaven is only an idea at the time of writing), you know this is an XML based tool too so it will suffer the same problems I just mentioned. Maven is, however, more flexible thanks to the life cycle structure and the ability of recursively build projects (as specified in their Project Object Model xml files) in a folder structure. This benefit is not enough to marry xml programming for life...one size doesn’t fit all, in fact in terms of build process; standardisation is the exception rather than the rule. I still need to look into Gradle, but it may be good answer to Ant/Maven approaches.

The solution:

I don’t know really. Sorry if you’ve read this far, but I just wanted to brain dump my discontent with XML programming. Maybe is a useful question for us developers using or building frameworks on a daily basis. Where should we allow for a more multi-purpose, less templated approach? Extensibility is not always the answer. Some frameworks like NAnt are very extensible, but offer very little at their core. The extended functionality is more useful than the basic framework itself.

So no, I don’t know the solution for avoiding XML programming (NAnt or Maven) for the time being I’m moving towards a build script fully written in PowerShell, with a flavour of Maven’s life cycles. I will write in detail about this later, it is not rocket science and is not a full solution, but is a start, just like this blog entry.