Come see my unit testing tutorial at PF Congres or PHPNW!

Exciting times! I’ve been invited to come and deliver my 3-hour tutorial “An introduction to writing unit tests using PHPUnit” at two more conferences: PFCongres (Utrecht, NL) next week, and PHPNW (Manchester, UK) in October. I’ve given this tutorial once before at php[tek], and I’ve presented it as a 45-minute talk at PHPUK and at the inaugural PHP Amersfoort meetup.

The main reason I started doing this talk is that I discovered that a lot of people struggle when they want to start writing unit tests. I remember this from when I wanted to write my first test. Sure, I understood the schoolbook examples, but how do I start? Where do I put my tests? How do I run them? And the biggest problem of all: my project doesn’t look anything like the schoolbook examples! Your project probably has a lot of dependencies, complex methods and a lot of links with external systems that make the code almost impossible to test. Not to worry: Harrie is here to help :-)

The thing I really value about those 3-hour tutorials is the possibility for a hands-on approach. Please bring your laptops and write the unit tests along with me, because I think that people learn most and remember best from actually doing it. We will start off with the basics, like where to put your unit tests in your project and how to run them. We will then gradually make it harder for ourselves; you will learn how to write testable code, how to deal with dependencies in your code and I will explain how you can start applying these principles to any project you’re working on tomorrow!

This and a lot more. If you’ve been planning to start testing your code but you never got around to figuring out how to do it, I hope you’ll consider attending my tutorial at one of the upcoming events:

PFCongres tutorial day
Friday 13th September – Utrecht, NL

PHP North West Conference – tutorial day
Friday 4th October – Manchester, UK

Also, make sure to check out the rest of the schedule – because both conferences are packed with a lot more great content! Hope to see you all in either Utrecht or Manchester!

Moving on

This news probably isn’t new for all of you since I already announced it on Twitter and Facebook earlier. However, in case you missed it, I decided to write a small blog post to let you know that I’ve decided to move on and leave Ibuildings. I’ve always really enjoyed my time there; the talented people, high-profile projects and great atmosphere make it one of the best employers a developer could ever wish for. However, I’ve been working there for almost eight years now, and I felt more and more that the time had come for something completely new. There’s so much more of the world to see!

So as of 1st August I’ve started at bax-shop.nl as a senior PHP developer. Bax-shop.nl is the largest webshop in the Benelux for sound, stage and studio equipment and is expanding like crazy. My first assignment will be to help the rest of the team build the new Magento-based website that should soon replace the existing one. After that there’s still a lot of other work to be done, and I hope that with everything I’ve learned during the last decade I can really make a difference.

So exciting times! I’m really looking forward to getting going. And don’t worry, I’ll still be speaking at conferences and user group meetings like I used to do. For example: in September I will do a talk on Varnish at PF Congres in Utrecht, and in October I will do my new and improved “Recognizing smelly code” talk at PHPNW in Manchester.

Oh, and yes, I am aware that I should blog more often. I was actually holding back blog ideas because I wanted to wait for my new website to be finished – but it has been at the legendary “80% done” status for almost a year now, so maybe I should just continue using this one for now ;-)

DB migrations: rename instead of drop

Reverting a dropped column

In my talks and article I always mention that when it comes to database migrations it is generally a bad idea to rely on undo-patches for rollbacks. As an example I always use the same story: “Imagine you write a database patch that removes a column from a table, and then you write an undo-patch that adds the column back again. Sure, your database schema is now the same as it was before – but the content is gone!” And then I would move on by saying that you probably shouldn’t rely on undo-patches for rollbacks anyway, because it is better to thoroughly test everything before updating so you are absolutely certain that you won’t have to roll back. Also, I recommended making backups first, and using those to revert database migrations when needed. Easier said than done!

The fact that I am in the luxurious position where this approach usually is an option doesn’t mean that it is a solution for everyone. For other projects where, for example, larger databases are involved, rolling back by restoring a backup would be considered highly inefficient (and if the database is in use it would actually result in data loss). Being able to roll back patches can be convenient, but reverting dropped columns can’t be done, of course.

When I did my talk at DPC10 in Amsterdam however, someone in the audience (sorry, I don’t remember who it was) suggested a very simple but very effective solution. In fact, it was such a simple solution that I couldn’t stand that I hadn’t thought of it before: if you rename your column or table first instead of dropping it, you can still just use an undo-patch to revert the change. At a later stage, when you are absolutely certain you don’t need to do a rollback anymore, you can simply drop the renamed column or table by writing another patch, for example in the next release.

Reverting a renamed column

As you can see, sometimes the simplest solution is the best. Beware though, as in some situations this approach might still cause problems: Records added to this table between the moment the patch was applied and the moment the undo file was applied won’t have a value in the age column, to name just one. As always when we are talking database version control, it all depends on the specific situation you are in.
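
To make the idea a bit more concrete, here is a minimal sketch of a rename patch and its undo-patch. It is only an illustration under a few assumptions: it uses Python with an in-memory SQLite database (version 3.25 or newer, for RENAME COLUMN) so that it is self-contained, and the table name person and the temporary column name age_deprecated are made up for this example. It also shows the caveat mentioned above: a record added between the patch and the undo-patch ends up without an age.

```python
import sqlite3

# Hypothetical schema: a "person" table with an "age" column that we want to get rid of.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute("INSERT INTO person (name, age) VALUES ('Alice', 34)")

# Patch: rename the column instead of dropping it, so the data stays in place.
PATCH = "ALTER TABLE person RENAME COLUMN age TO age_deprecated"
# Undo-patch: simply rename the column back.
UNDO = "ALTER TABLE person RENAME COLUMN age_deprecated TO age"
# A later patch, written once a rollback is no longer needed, would drop
# age_deprecated for good. It is deliberately not executed here.

conn.execute(PATCH)
# The application no longer knows about "age", so new rows are written without it.
conn.execute("INSERT INTO person (name) VALUES ('Bob')")
conn.execute(UNDO)

rows = conn.execute("SELECT name, age FROM person ORDER BY id").fetchall()
print(rows)  # Alice keeps her age; Bob's age is NULL (None in Python)
```

The same two statements would work as patch and undo-patch in most databases, although the exact ALTER TABLE syntax differs per vendor.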

I myself still refuse to write undo-patches as most patches create new tables or columns anyway, instead of removing them – and if patches actually do drop tables or columns they probably haven’t been very significant for quite some time anyway. Maybe the main reason is that I have simply never found myself in a situation where an undo-patch would have helped me. This might be different for you though, and if you are looking for a way to roll back dropped columns or tables, renaming them first might be a pretty good idea.

Benchmarking Xdebug

A very popular and widely used PHP extension for debugging and profiling PHP applications is Xdebug. Normally this tool would be used on your development environment, or on rare occasions on some remote environment that is similar to the production environment. However, a little while ago one of my co-workers found a production environment running Xdebug, and we wondered how severely this was influencing performance. Since you are adding extra overhead it is very likely things will run slower. But how significant is this difference? Would it be something worth considering?

Disclaimer
Now I know that debugging a production environment is generally considered unnecessary and a really bad idea. I myself can’t really think of a situation where I would actually want to have Xdebug on my production machine – but that’s not the point. I just had a new question that I wanted to answer: “How much does installing Xdebug affect the performance of PHP?”

The hardware
To be able to test this I first needed to find a proper testing environment. Unfortunately I don’t have any webservers lying around, and testing directly on my MacBook would probably be a bad idea – since there are a million processes running that all influence the performance of the machine, and therefore would influence the results. After some pondering I realised that I had an old (well… more like ancient) laptop lying around that would actually suit this purpose quite well. When it comes to speed it doesn’t come anywhere close to a modern production environment – but since we will be looking at relative changes rather than absolute measurements I figured it would be just fine.

The software
I downloaded and installed a fresh new copy of Debian. Once it was installed I decided to take the easy way and install PHP simply by using apt-get install php5. Finally, I installed xdebug using the PECL installer. With just this installed and nothing else, I was convinced my benchmark results would be quite reliable.

Software installed:

  • Debian 5.0.8
  • PHP 5.2.6 with Suhosin-patch
  • Xdebug 2.1.0

The benchmark
Why write a benchmark when they already exist? For testing purposes I decided to use the bench.php script that comes with the PHP source. This script does not run out-of-the-box when Xdebug is installed, because one test (Ackermann function) exceeds the default Xdebug limit of 100 nested function calls. Now I could of course increase this number, but I wanted to test how Xdebug performs without any modifications to the configuration – so I decided to simply skip this one test and just run the remaining ones.
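
To get a feeling for why a recursive Ackermann function runs into a limit of 100 nested calls so quickly, here is a small sketch. It is a Python analogue and not the code from bench.php, and the arguments (3, 4) are picked purely for illustration, since the arguments the benchmark uses are not mentioned above. The depth and stats parameters are helpers added here to record how deep the calls nest.

```python
def ack(m, n, depth=1, stats=None):
    """Naive recursive Ackermann function that records the deepest call level."""
    if stats is not None:
        stats["max_depth"] = max(stats["max_depth"], depth)
    if m == 0:
        return n + 1
    if n == 0:
        return ack(m - 1, 1, depth + 1, stats)
    # The inner call is evaluated first, then its result feeds the outer call.
    return ack(m - 1, ack(m, n - 1, depth + 1, stats), depth + 1, stats)


stats = {"max_depth": 0}
result = ack(3, 4, stats=stats)
exceeds_limit = stats["max_depth"] > 100  # 100 is the nesting limit named above
print(result, stats["max_depth"], exceeds_limit)
```

Even with these small arguments the function nests deeper than 100 levels, so any limit of that size aborts the test long before it finishes.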

Finally, after a lot of preparation, I was ready to run the script. \o/

The results
I decided to run the script three times with the xdebug extension enabled, and then another three times with the xdebug extension disabled. The results were pretty clear:

                   No Xdebug (time in seconds)    Xdebug (time in seconds)
Test               Run 1    Run 2    Run 3        Run 1    Run 2    Run 3
simple             0.687    0.648    0.681        1.327    1.361    1.367
simplecall         0.813    0.744    0.786        3.581    3.770    3.875
simpleucall        1.138    1.225    1.169        4.439    4.510    4.877
simpleudcall       1.289    1.469    1.336        4.621    4.609    4.747
mandel             2.016    2.259    2.161        3.646    4.028    4.036
mandel2            2.774    2.753    2.766        3.495    4.008    4.215
ary(50000)         0.171    0.175    0.171        0.210    0.224    0.224
ary2(50000)        0.136    0.136    0.136        0.172    0.180    0.184
ary3(2000)         1.363    1.365    1.416        2.133    2.463    2.450
fibo(30)           3.694    3.517    3.538        10.418   12.364   12.943
hash1(50000)       0.212    0.216    0.214        0.515    0.618    0.638
hash2(500)         0.264    0.266    0.266        0.283    0.357    0.342
heapsort(20000)    0.676    0.667    0.671        1.026    1.198    1.196
matrix(20)         0.556    0.552    0.555        0.810    0.960    0.936
nestedloop(12)     1.221    1.208    1.311        2.197    2.570    2.573
sieve(30)          0.786    0.786    0.789        1.105    1.231    1.226
strcat(200000)     0.090    0.093    0.096        0.148    0.173    0.169
Total:             17.886   18.079   18.062       40.126   44.624   45.998
Average:           18.009                         43.583

So there we go. Without Xdebug it takes about 18 seconds to run the script, and with Xdebug enabled it takes about 44 seconds: 242% of the original time (or: it took 2.42 times as long to run the script).

So what does this mean?
In this situation adding Xdebug slowed down the execution of the benchmark script quite significantly.

So what doesn’t this mean?
This doesn’t mean that your application will slow down with the same relative percentages, nor does it mean that you can compensate for this by buying 2.42x more servers. When we look at the breakdown per test it’s clear that the differences between tests are quite significant as well. For example: the time needed to run the ‘ary’ test increased by only 27%, while the time needed for the ‘simplecall’ test increased by a whopping 379%.

Apparently the overhead caused by running Xdebug varies and is related to what kind of operations you are running. Calling functions seems to be affected a lot by Xdebug, while the effect of manipulating strings or arrays is less severe.

Test               No Xdebug (s)   Xdebug (s)   Percentage of original time
simplecall         0.78            3.74         479%
simpleucall        1.18            4.61         391%
simpleudcall       1.36            4.66         341%
fibo(30)           3.58            11.91        332%
hash1(50000)       0.21            0.59         276%
simple             0.67            1.35         201%
nestedloop(12)     1.25            2.45         196%
mandel             2.15            3.90         182%
strcat(200000)     0.09            0.16         176%
ary3(2000)         1.38            2.35         170%
heapsort(20000)    0.67            1.14         170%
matrix(20)         0.55            0.90         163%
sieve(30)          0.79            1.19         151%
mandel2            2.76            3.91         141%
ary2(50000)        0.14            0.18         131%
ary(50000)         0.17            0.22         127%
hash2(500)         0.27            0.33         123%
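
For anyone who wants to check the arithmetic, the percentages can be recomputed from the raw run times. The sketch below is only an illustration: it copies three rows from the results (the totals, simplecall and ary), averages the three runs of each and expresses the Xdebug time as a percentage of the time without Xdebug. The function and variable names are made up for this example.

```python
# Run times in seconds, copied from the results: (runs without Xdebug, runs with Xdebug).
runs = {
    "total":      ([17.886, 18.079, 18.062], [40.126, 44.624, 45.998]),
    "simplecall": ([0.813, 0.744, 0.786],    [3.581, 3.770, 3.875]),
    "ary(50000)": ([0.171, 0.175, 0.171],    [0.210, 0.224, 0.224]),
}


def mean(values):
    return sum(values) / len(values)


def percentage_of_original(without, with_xdebug):
    """Average Xdebug run time as a percentage of the average time without Xdebug."""
    return 100 * mean(with_xdebug) / mean(without)


percentages = {name: percentage_of_original(a, b) for name, (a, b) in runs.items()}
for name, pct in percentages.items():
    print(f"{name}: {pct:.0f}% of the original time ({pct - 100:.0f}% increase)")
```

Note the difference between "percentage of the original time" and "increase": a test that takes 479% of the original time has become 379% slower.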

A more realistic setup – Testing WordPress with ApacheBench
The numbers from the previous test are quite clear – but they hardly represent a real-world example of an application. To get a bit more insight into how enabling or disabling Xdebug would affect an actual real-life application I decided to add Apache, MySQL and WordPress into the mix, and benchmark the default homepage of a standard WordPress installation using the benchmark tool ApacheBench (ab), which comes with Apache. Adding these applications means there will be a lot more factors to influence the results – but I was hoping the results would somehow give an indication of how much an average application would be affected by installing Xdebug.

I ran ApacheBench a number of times. Every time it fired a total of 100 requests, and every time I changed the maximum number of concurrent requests. I assumed that when Xdebug was enabled the requests would take longer – but because of that it’s also likely that the maximum number of concurrent requests that’s still acceptable is reached sooner. After a couple of test runs I had the following figures:

Max. concurrent requests  No Xdebug (req./sec.)  Xdebug (req./sec.)  Difference (req./sec.)  Difference %
1                         2.79                   2.22                0.57                    20.43%
5                         2.75                   2.16                0.59                    21.45%
10                        2.73                   2.13                0.60                    21.98%
15                        2.69                   2.11                0.58                    21.56%
20                        2.64                   2.10                0.54                    20.45%
25                        2.61                   1.34                1.27                    48.66%
30                        0.95                   - timeout -         0.95                    100.00%

or in a pretty graph:

So not only does the server handle about 20% fewer requests per second in normal situations (which means each request takes roughly 25% longer), the maximum number of concurrent requests your server can handle is also influenced significantly.
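
A drop in requests per second and an increase in time per request are not the same percentage, which is easy to mix up. The sketch below recomputes both from the ApacheBench figures. It assumes that, at a fixed concurrency level, the average time per request is inversely proportional to the number of requests per second; the function names are made up for this example.

```python
# Requests per second from the ApacheBench runs: concurrency -> (no Xdebug, Xdebug).
throughput = {
    1: (2.79, 2.22), 5: (2.75, 2.16), 10: (2.73, 2.13),
    15: (2.69, 2.11), 20: (2.64, 2.10), 25: (2.61, 1.34),
}


def throughput_drop(without, with_xdebug):
    """Percentage of requests per second lost when Xdebug is enabled."""
    return 100 * (without - with_xdebug) / without


def time_increase(without, with_xdebug):
    """Percentage of extra time per request implied by the same two figures."""
    return 100 * (without / with_xdebug - 1)


for level, (a, b) in throughput.items():
    print(f"{level:>2} concurrent: {throughput_drop(a, b):.2f}% fewer req/s, "
          f"{time_increase(a, b):.1f}% more time per request")
```

The run with 30 concurrent requests is left out, because the Xdebug run timed out and there is no figure to compare against.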

Conclusion
So as a conclusion we can say what we actually already suspected: Xdebug influences performance quite a lot. How much exactly is hard to tell and depends on what your application does. Xdebug is a great tool, and pretty much irreplaceable when you are looking for a proper tool to debug or profile an application on your development or testing environment. Running it on your production machines however is something that would only make sense in very specific situations, as it is typically unnecessary and it slows down your application a lot. Also, make sure you turn off Xdebug on environments that you want to use for benchmarking your application – as it will influence your results.

Edit: Directly after publishing this article Xdebug mentioned on Twitter that the current version in SVN is supposed to be a lot faster. I’ll benchmark this new version soon – and let you know if there is any significant difference.

Organizing the Dutch PHP Conference

Dutch PHP Conference 2011
Some of you might already have noticed, others might not, but this year I am part of the team at Ibuildings that is responsible for organizing the DPC: the Dutch PHP Conference. As you can imagine I am thrilled about this! It is hard to believe that a couple of years ago I was just a visitor at the DPC, visiting one of my first conferences, and this year I am actually helping to put it all together!

Of course, I could not possibly do this just by myself. Just imagine what needs to be done: the website needs to be built, the talks need to be selected, the speakers need to be informed, the ticket sales have to be taken care of, the venue needs to be booked, the flights and hotel rooms for the speakers have to be arranged, the social event needs to be organized, and so on. The list just keeps going. Organizing the conference is a huge task and a lot of people at Ibuildings are helping to organize it. Luckily most members of the team are pretty experienced by now, as they have been involved with organizing the DPC from the very first edition. It is comforting to have such a solid base to rely upon.

I myself will be mainly working together with my colleague Felix de Vliegher. Together we are responsible for what I think is the most fun part of the entire operation: creating the conference schedule, communicating with the community and communicating with the speakers. We will also host the conference, which I think is pretty exciting. Yes I have spoken to groups before. Quite large groups actually. But this time I will be speaking to a fully packed room on the main stage. Cool! I suppose it is much like doing a talk – only this time I will not be talking about database version control, but about what kind of sunglasses we found on the floor of room B and when you will get your food.

At the moment we are very busy selecting the talks and creating the conference schedule. Together with a couple of colleagues we are trying to figure out which talks are the best ones. Not an easy task when you can only pick about 30 talks from 240 great proposals! I really enjoyed reading them all and it is interesting to see all the different subjects and ideas people come up with. It is sad that we will have to disappoint most of the submitters – but having them all would result in a 16-day conference or something, which would not be very realistic either, now would it? :-)

TechPortal article on database version control

Last year I spoke at different conferences throughout Europe about database version control. However, a while ago I decided that I had done the talk often enough and that it was time to move on. Therefore I wrote a big wrap-up article that summarizes everything I told (and learned) during these events. I’m proud to announce that this article was published on Ibuildings’ techPortal site today!

You can find the article here:
http://techportal.ibuildings.com/2011/01/11/database-version-control/

Why my analogue year calendar rocks

I think most of us don’t have analogue calendars these days. Neither did I: for both my personal and my work calendar I use Google Calendar. My phone shows appointments in these calendars (Android built-in functionality) and I can set reminders on my phone for appointments I don’t want to miss. Useful, because I always have my phone with me, so I never miss anything. The same goes for my laptop. On both devices I can toggle personal and business appointments on or off based on what I want to see, and most importantly: everything is in one place. An absolute must from a time-management perspective.

The 2011 year calendar hanging next to my desk

Therefore I was a bit skeptical when Jeroen van Sluijs, one of my colleagues, told me about his analogue calendar. He showed me, and it was a simple piece of paper with the entire year calendar printed on it. How on earth can this be useful when I have all this fancy technology to do the job? Nevertheless, I decided to give his calendar a try. I printed the calendar on a piece of paper and put it next to my desk. I used a text marker to mark all official holidays, as well as “days off”. The basic rule here is: pink means day off – you get to sleep in. During the year I started adding other stuff. Conferences, due dates for calls for papers. Important dates I definitely didn’t want to miss or was looking forward to. I decided to mark those by drawing a circle around the date and writing a one-word description next to it.

And guess what? It works great!

The reason that this is useful to have in addition to your existing calendar is that whenever the scale of your calendar view changes, the significance of appointments changes as well. It’s quite easy: when I’m looking at a day view I want to know exactly what’s going on at what time. I need to go and get a haircut at 3pm. Great, I’ll be there. On a month view however the time isn’t important – just the fact that I’m getting a haircut on that date will suffice. On a year view, seeing all of the year’s appointments would simply result in one big unusable and unreadable list. It’s far too detailed!

That’s exactly what this calendar next to my desk does: it gives me a filtered view of the year, only showing the appointments that are important on a year’s-view. I don’t mark my tennis lessons there every Friday. Neither do I put down when I’m getting a haircut – I’ve got Google Calendar to do that. The year calendar is for stuff like days off, conferences, deadlines of projects etc. It’s hanging next to my desk, so whenever I want to know something like that, I simply have to turn my head and I see directly what’s going on.

Everybody is different, and especially for these kinds of things it’s important to not just do as you’re told, but choose the solution that works best for you. However, the calendar has worked great for me so far – so maybe it will work for you as well. Also, figure out your own standard – use different colors with different meanings. Write stuff next to it. Draw arrows. Maybe use certain shapes for outlining certain dates. Get creative! It’s analogue, the sky is the limit!

The calendar I use was generated by this prehistoric but still working CGI script: http://cgi.dit.nl/kalender.cgi (Dutch). Be sure to mark “Enkel de kalender (printen)” at the bottom, and of course any other options you prefer. English versions are available as well and easy to find. You can find an example here: http://www.freeprintablecalendar.net/2011/printcalendar.aspx.

Video: me doing my database version control talk at PHPNW10

As mentioned before I’ve done my talk “database version control without pain” a couple of times last year. When I did my talk at PHPNW10 in Manchester the organisation made videos of the different speakers, and last week they published the video of me doing my talk on blip.tv.

I found it quite shocking to see myself doing my talk. I’m sure we’ve all been there: you record your own voice (for example on your answering machine) and when you hear it back it sounds really awkward. It’s that feeling, but then with video… in a foreign language… 1 hour long. Anyway, I learned a lot from seeing this back, and I can definitely use the video to improve my future talks. For example: I now see I really need to be more aware of where I keep my hands when I’m speaking, and that my English still has this annoying Dutch accent going on :-)

Anyway, if you missed the talk or if you want to see it again, or if you simply want to see me speak in front of an audience (hi mom!) this is your chance!

Video: Database Version Control without pain at PHPNW10
View the video on blip.tv

The slides are hard to read on the video because of the low picture quality. Therefore I’d recommend keeping the associated slides next to it and clicking along with me.

Many thanks to PHPNW, Magma Digital and everybody else involved in organizing this great conference and making and publishing the videos.

Sharing slides is a gesture

Whenever I do a talk, it always surprises me how many people ask about my slides. Some people couldn’t make it to the conference and ask for the slides instead, others want to have the slides as a reminder of the talk and some of them might even use the slides to do a slide karaoke back at the office. Great purposes, and usually I’m more than happy to share them with the world. What surprised me even more however, is that some people seem to get upset whenever a speaker decides to not share his slides – or simply forgets about uploading them, or hasn’t come round to doing it just yet. “Whenever a speaker doesn’t upload his slides to slideshare, God kills a kitten”, seems to be the general thought. I disagree for a number of reasons.

My talk is more than just my slides

Me, talking about pizza at the DPC

Photo by Jeroen van Sluijs

First of all, talks are usually more than just the slides. For example: during the most important part of my last talk, my slides showed a picture of a slice of pizza for about 3 minutes. By browsing through someone’s slides you might get a general idea of what the talk is about, but you’re missing out on the details – and probably the most important message the speaker is trying to get across. It gets even better when the speaker uses more “Presentation Zen”-like techniques. Thijs Feryn’s talk “PHP through the eyes of a hoster” is a fantastic talk, but if you would download his slides you would mainly just end up with a bunch of beautiful, but meaningless photographs (and some keywords).

That, however, is the way slides should look! They should confirm and strengthen what the speaker is telling, not tell the story themselves. Sharing such slides is therefore not only pointless: anybody who views them without the talk is likely to get a distorted view of what you were trying to say in the first place.

Sharing slides is a gesture
I share them however, and most speakers do. My main reason is that people can browse through the slides again afterwards, and remember what I told them. If I did a good job the pizza slide will then remind the viewer of the metaphor I used and what I was trying to say when this slide was on the screen. Even people that didn’t see my talk might still get an idea of the contents by looking at the slides. They will probably miss the main point, but at least they will see a couple of mentioned tools that are worth giving a try – and if that makes you happy that’s just great! In such a case I’m glad you at least found some benefit in the work I’ve done.

Sharing slides is a gesture though. Something extra the speaker does especially for you. Not sharing slides is not “evil”, it’s normal. There are plenty of reasons why a speaker wouldn’t upload his slides. Maybe he thinks the slides themselves shouldn’t be viewed because the viewer would miss out on so much background information and explanations that it makes the talk look plain and stupid. Maybe there’s a reason like copyright restrictions on used photographs, or maybe the speaker doesn’t want to share his slides because he wants to do the same talk somewhere else next month and he doesn’t like it when people in the audience are already reading his slides before he has even started the talk.

Either way, it’s the speaker’s choice and not something that should be taken for granted. Sharing slides is a little bit extra a speaker did for you, and worth saying “thank you” for. Speakers don’t owe you anything, they usually don’t get paid (worse: they often even have to pay for their own expenses just to be at the conference) and they’ve usually put many hours into preparing their talks. All just for you! So next time, be grateful for the efforts they’ve put into it and give them a beer, instead of bitching about speakers that for whatever reason chose not to share their intellectual property with the entire world.

Burying a talk: a year of database version control

Speaker badges

About a year ago the idea was born to do a talk on database version control. Main reason: I didn’t really have a clue about database version control myself. What’s the right approach? What tools are out there? Am I doing database version control the right way, or is there a better way? While figuring all this out I decided to document all my steps, and finally use the results in a talk so I could share them with the rest of the world. I submitted a talk called “Database version control without pain” to the Dutch conference PFCongres, and it got accepted!

Zwolle, Amsterdam, Manchester and Barcelona

Soon after that I got to repeat the same talk at other places in Europe. During these events I met some great people, had a lot of fun, saw cities I had never been to before and learned a lot as well. I tried to improve my talk based on the feedback I got, and sure enough most people seemed to like it!

In total, I did the talk four times now:

Enough!

I think four is enough. I assume that the majority of the European PHP community has visited at least one of the events above, and had the chance to see the talk (and besides that I’m getting a bit tired of repeating the same story over and over again ;-) ). It’s time for something new! I don’t exactly know what it will be, but I have several ideas on new talks so hopefully I can make at least one of those ideas into a talk, and start submitting to the calls for papers at the different conferences really soon!

Article on database version control

As a last hurrah I’m planning to write an article on database version control, which will probably be published either here or on techPortal. Missed the talk? Looking to refresh your memory? Or downloaded the slides and you’re still wondering what the heck I was talking about during the slide with the picture of a sunken boat? No worries! Just keep an eye on this site and you’ll be able to read all about it soon.

But for now: Farewell database version control talk! You’ve been a great friend!