Some time ago, in between working on Zend Framework, I booted up a couple of libraries that I really wanted to integrate into my workflow. Recently, I’ve been putting these through the grindmill so they can be properly released and supported for public consumption via PEAR. Just as Mockery fell out of older work on PHPMock, Mutagenesis will fall out of another project called MutateMe. This is a short introductory article as to what Mutagenesis will do and why. In other words, what the heck is Mutation Testing?
First, some background.
The most common means of measuring confidence in a test suite is the Code Coverage metric. Code Coverage essentially checks, on a per-class basis, how many of the lines of code in the class are executed by a test suite and expresses this as a percentage. For example, a Code Coverage of 85% means 85% of the lines of code in a class were executed and 15% were not. The greater the number of lines of code executed, the more confidence one can presumably have that a test suite is doing its job, i.e. verifying class behaviour, preventing the introduction of bugs, supporting refactoring, and so on.
I have a huge and insurmountable problem with Code Coverage. For starters, my average Code Coverage is closer to 80% than the 90% expected of projects such as Zend Framework. The gap is explained by me not testing what I call “braindead” functions, i.e. methods which are either ridiculously simple, where a malfunction would quickly become self-evident, or which are marginalised (on the borders of deprecation). So Code Coverage actually increases the amount of work I need to do for very little gain and a lot of frustration.
Secondly, Code Coverage is easy to spoof or misinterpret. Since it’s a metric measuring the execution of source code, you need only…well…execute the source code. It’s a simple matter to construct a series of wonderfully useless tests to do just that and obtain a high Code Coverage result – it’s done all the time, in my experience, once someone’s patience in writing quality unit tests runs out. It is particularly evident in cases where unit tests are written after the source code is completed – a still too common practice in PHP. The less villainous flipside is that certain nuggets of source code are fundamentally difficult to test. For example, a complex algorithm suffering from poor documentation may make composing a suitable unit test near impossible. The rollout of OAuth was filled with such examples.
This leads into my opinion of Code Coverage. I view the venerable Code Coverage metric as a near pointless exercise. While it may tell you how much source code a test suite exercises, it tells you nothing about the actual quality of those unit tests. They could be good tests, sort-of-good tests or absolutely horrendous tests – Code Coverage will never tell you either way. I say near pointless because there are precious few alternatives. We need something to give us a reason to trust and have confidence in test suites, and Code Coverage is easy to implement and has been a part of PHPUnit since forever. So, by and large, we make do. We measure Code Coverage just to make certain some kind of unit testing was performed.
Is there nothing better?
A good unit test serves a simple purpose. It verifies a behaviour of an object. In PHP, we’re more likely to verify umpteen million behaviours in a single test (count your assertions!) but we’ll let that slide. Since a test verifies behaviour, it follows that a test should fail when that behaviour is changed. If a test does not fail when class behaviour is changed, it also follows that the original behaviour was not fully tested, i.e. there is a gaping hole in our test suite, whether due to a flawed or a missing test, that could allow bugs entry into our application. So, to really stick unit tests under a microscope to assess their quality and our confidence in them, we need to introduce changes into the source code under test and see if the unit test suite can or cannot detect them.
This process is known as Mutation Testing. Mutagenesis is a Mutation Testing framework for PHP 5.3+.
Mutation Testing, as you have probably surmised, is not a super-complex activity. You take a set of source code and compile a list of possible “mutations” that are likely to break the behaviour of the source code. Then, you apply one mutation to that source to create a “mutant”, i.e. a copy of the source code with the mutation change applied. Next, you run the source code’s test suite against the mutant and see if any tests fail. If a test fails, celebrate – the mutation was detected so your tests were, in this instance, adequate. If no test fails, curse the Gods – the mutation was not detected and you’ll need to figure out whether a new test is needed or an old one modified/corrected. Rinse and repeat the above for each mutation you’ve compiled.
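The loop described above can be sketched in a few lines. This is an illustrative Python toy, not Mutagenesis itself (which targets PHP and has far richer mutation operators and a real test runner); the source string, the two operators and the single "test" are all invented for the example:

```python
# A minimal mutation-testing loop: generate mutants, run the tests
# against each, and count how many mutants the tests "kill".

SOURCE = "def add(a, b):\n    return a + b\n"

# Each "mutation" is a (search, replace) pair applied once to the source.
MUTATIONS = [("+", "-"), ("+", "*")]

def run_tests(source):
    """Execute the (possibly mutated) source, then run its test suite.
    Returns True if all tests pass."""
    ns = {}
    exec(source, ns)
    try:
        assert ns["add"](2, 3) == 5   # the entire "test suite"
        return True
    except AssertionError:
        return False

killed = 0
for search, replace in MUTATIONS:
    mutant = SOURCE.replace(search, replace, 1)  # create one mutant
    if not run_tests(mutant):                    # a failing test kills it
        killed += 1

print(f"{killed}/{len(MUTATIONS)} mutants killed")
```

Here both mutants change `add()`'s result, so both are killed; a surviving mutant would be the signal to go write or fix a test.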
Mutations are typically quite simple, such as replacing operators, booleans, strings and other scalar values with either an opposing form or a random value. Expressions might also be reversed or driven to zero to give an opposing boolean or zero value. Making such minor changes seems like a minor irritation, but behind every serious flaw in an application is one or more smaller contributing errors. If your test cases can detect the potentially contributing errors, then there’s an excellent chance they would detect the bigger ones anyway. This is known as the Coupling Effect in Mutation Testing.
Some of you will be vaguely aware of Mutation Testing. In terms of implementations, Ruby has Heckle, Python has Pester, and Java has Jumble, Jester and a couple of others. Those who prefer Microsoft’s technologies can use Nester. There’s a running rhyme apparent, since so much is inspired by the original Jester framework for Java. To my knowledge, Mutagenesis will be the only Mutation Testing framework for PHP (though I sincerely wish I were wrong).
Examining those libraries, you eventually realise two problems with Mutation Testing which explain its lack of popularity until relatively recently: performance is a concern, and Mutation Testing requires a human brain to complete the process.
Performance is a concern because each mutation requires a test suite to be executed. Imagine a set of classes from which you extract 100 possible mutations, coupled with a test suite that takes 5 minutes to run. A basic Mutation Testing framework (e.g. Ruby’s Heckle) would therefore take 500 minutes to complete a Mutation Testing session. That’s 8.3 hours of continuous Mutation Testing. Mutation Testing for Zend Framework would be very interesting.
Similar to Jumble for Java, Mutagenesis will utilise a few heuristics (shortcuts) to significantly improve performance without compromising results. We only need a single test to fail in order to rule that a mutation was detected and killed, so we can do a few things to boost performance:
1. Terminate the test suite on first failure/error or exception.
2. Execute test cases in order of execution time ascending (fastest first; slowest last).
3. Prioritise execution of the last test case to detect a mutant, to take advantage of same-class detection.
4. Log which tests detect which mutations, and prioritise those associations in subsequent runs.
The effect of the above is to speed up Mutation Testing by a significant degree. The final heuristic ensures that for gradually changing source code and tests, the first Mutation Testing process might take a while but subsequent runs will be significantly faster making them far more usable in a Test-Driven Development setting. Mutation Testing is best served with a healthy dose of efficiency.
The second reason for its lack of popularity is that Mutation Testing can’t analyse the logic of the source code under test. For example, an expression might accept any integer less than 10 to evaluate to TRUE. If the input from another class were 7, and a mutation were generated to swap this for a 9, then the associated unit test would still pass (switching 7 for 9 still allows the <10 expression to evaluate to TRUE). If you recall, when a mutant passes the test suite we assume either the presence of a flawed test or the lack of a suitable test. Obviously, as the above suggests, this isn’t always the case. Mutation Testing can and often will report false positives.
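The scenario above, sketched in Python rather than PHP for brevity: the mutation changes the input value but not the observable behaviour, so no test could ever kill this mutant.

```python
# An "equivalent mutant": swapping 7 for 9 leaves every observable
# result unchanged, so the surviving mutant is a false positive,
# not evidence of a missing test.

def is_small(n):
    return n < 10

# Original call site uses 7; the mutant swaps it for 9.
assert is_small(7) is True   # original: the test passes
assert is_small(9) is True   # mutant: the test still passes
```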
Ruling out false positives, coupled with the need to improve test suites to detect more mutations, makes Mutation Testing a source of extra work. Who likes extra work least? Programmers, especially the lazy kind.
Mutation Testing is not a far-fetched idea. The principles are sound and it beats the pants off Code Coverage when it comes to measuring what confidence we can have in our test suites. It is still hampered, as a methodology, by the lack of good implementations across programming languages. Mutagenesis, by adopting implementation heuristics from Java’s Jumble, should avoid that fate and offer a decent framework in PHP that performs as well as can be expected.
Once it’s released…of course. Mutagenesis is in development but should see a fresh release in a couple of weeks alongside Mockery. I’ll be looking forward to seeing how people perceive it. Mutation Testing has zero presence in PHP to date, but having something to complement Code Coverage can’t do any harm!
Until a few years ago, there were basically two tools you could use to generate API documentation in PHP: phpDocumentor and Doxygen. phpDocumentor was long considered the standard, with Doxygen getting notice when more advanced features such as inheritance diagrams are required. However, phpDocumentor is practically unsupported at this time (though a small group of developers is working on a new version), and Doxygen has never had PHP as its primary concern. As such, a number of new projects are starting to emerge as replacements.
One of these is DocBlox. I am well aware there are several others -- and indeed, I've tried several of them. This post is not here to debate the merits or demerits of this or other solutions; the intention is to introduce you to DocBlox so that you can evaluate it yourself.
I’ve been active in trying to get pipelining more widely deployed, but to date I haven’t tested mobile browsers much. So, one VM and two test pages (20 images on each) later, I asked my twitter peeps to hit it with their phones while I was watching with htracr.
The results: mobile browsers do pipeline, sometimes aggressively.
A normal build of Firefox (and pretty much any desktop browser today) looks like this:
As you can see, each connection pauses for each response; no pipelining. However, in Patrick’s builds of Firefox, we can see LOTS of pipelining:
Here, you can see the multiple requests in the packet displayed on the right, as well as the bright-red requests and responses that mean multiple messages are happening simultaneously (here, the responses are 304s, so they’re not very big). Even though Firefox is pre-warming a bunch of connections, it doesn’t need them; it validates 20 images per page very quickly.
So, if the stuff brewing in Firefox is the cutting edge of HTTP pipelining, how are the mobile browsers doing it today? First, Opera Mobile:
Opera mobile is only using two connections, and doing some pipelining, but it could do more. Opera Mini from my iPhone (in forced HTTP mode) looks like this:
Again, you see some pipelining, but there’s a lot of connections being used, which on mobile might not be the best answer.
Finally, an Android browser snapshot:
Again, a fair amount of pipelining, and lots of connections.
I’m going to be doing some work on htracr soon, both to make it easier to export the results for sharing, and to give lots more information. If anybody wants to help out, that’d be great — just go to github!
Bradley Holt has posted an interesting article on why rapid release cycles are a good idea for Zend Framework major versions.
For a framework (and maybe for other software), I think the following rules are necessary in order for a rapid release cycle to work:
- Minimize backwards compatibility changes between major releases. Targeted and strategic refactoring, rather than major overhauls, is preferable if you are releasing often. Small backwards compatibility changes make migrating from one major version to another much easier.
- Mark some major releases as “Long Term Support” (LTS) releases. Provide bug fix updates and security patches to these releases for three to five years. This provides a “safe” option for those who value stability and don’t want to upgrade very often. In the context of Zend Framework, it is obviously Zend’s decision whether they want to take on this burden. If not, then I don’t think a rapid release cycle is viable.
What are the concerns with a rapid release cycle? I’ll paraphrase, and then address, the major concerns that I’ve heard.
Well worth a read.
Summer will end. When, how much, and why - I don't know. These are questions for financial analysts and investors, people with their ears (and attention) much closer to the market-ground than entrepreneurs can afford. But the signs of winter are all around us: persistently high unemployment, market shocks, ill-timed austerity measures. For a while, startupland can stay insulated from these broader forces, but not indefinitely. The LPs that fund booms are, after all, pension, municipal, and sovereign wealth funds. Consumers need disposable income to invest in the latest products, as do the companies who serve them and the advertisers who reach them.
We've enjoyed these years of summer. But winter is coming.
Entrepreneurs should be prepared. Obviously, those who depend on raising money at a specific time in the future should be on their guard. Anyone whose plan is to "raise money in six months" is really saying, "I am planning on no significant financial crises happening six months from now." I wouldn't want to be making such specific predictions right about now. As every expert has been saying: if you can raise money on fair terms right now, do it. If you can spend money fueling your engine of growth, do it. If you need to double-down on a major pivot, on a drive to hit product-market fit, do it.
But we have much bigger questions to tackle about what will happen in winter. For example, entrepreneurship will suddenly stop being cool, and go back to being seen as risky, a little crazy, and a little dangerous. Those of us promoting the idea that entrepreneurship is a viable career option need to be ready. Right now, the "startup career" is an easy case to make. It's going to get harder. We need to be ready.
Those people working to nurture and support new startup hubs may see all of their hard work destroyed. I am especially worried about the burgeoning scene in places like New York. Will Union Square become another Silicon Alley? I hope not. We need to be thinking about this now. The endless networking groups that thrive on hype and sizzle will suddenly wither. Do we have enough groups that are focused on the nuts-and-bolts of real entrepreneurship to keep those ecosystems vibrant? Which kind of group are you investing your time and energy into right now?
I expect that a shocking number of the current crop of incubators, accelerators, and other startup-support programs will suddenly disappear. In summer, it's all-too-easy to have your program look like a success, because there is an endless supply of talented people becoming first-time entrepreneurs and an endless supply of investment dollars chasing them when they graduate. It's hard to know, in summer, which of these programs actually add value and which are glorified admissions officers. Winter will tell. If you depend on one of these programs for support, be ready.
This may sound like all doom and gloom, but I'm feeling personally very optimistic. Hype gets in the way. Every ounce of energy invested in vanity metrics and success theater could have gone into building real value instead. As I've been saying in my talks for a while now, the real entrepreneurship - not the caricature from pop culture and mass media - is boring, tedious, and extremely difficult. It's anything but cool: product prioritization meetings, deciding which customers to listen to and which to ignore, and valiantly trying to keep the vision alive in the face of contradictory facts. To recruit people into that business, the real innovation business, should be our goal. I hope all of us are ready to reach out to those founders and would-be founders and nurture and support them through the hard times. That will create real value.
As I see it, the big opportunities to change entrepreneurship come in winter. During the last crisis, I was asked constantly for my advice on how to save money and cut costs. Most people didn't really expect my answer to be about the Build-Measure-Learn feedback loop and all the rest of the Lean Startup methodology. But the truth is: to save money, we have to cut any costs that are slowing down our ability to find validated learning about whether we're on the path towards a sustainable business. Cutting any other costs just helps us go out of business more slowly.
But this begs the question: if we're spending money on something that is slowing us down, why are we doing it at all? And why did we have to wait for a financial crisis to cut those costs? Why not cut them now?
Fall is a pretty good season to get serious about discovering which actions really contribute to creating value and which are waste. It's harder to act in a disciplined way in summer. All around you, you see excess and nonsense, companies being bought or funded for zillions of dollars without traction. It's hard to stay focused. Remember: most of those "lucky" companies die inside their new parent companies. Remember: in the long run, the surest way to be successful is to create more value than you capture. Remember: the truly great entrepreneurs didn't get in this to make money, but to change the world. Stick to that plan, and - even if you fail - you'll feel good about yourself in the morning, in any season.
What is the male/female ratio?
The answer to this question is usually 5% max female.
Sometimes people then look at me expectantly for me to explain what I am going to do about this, and I usually look a little bit scared.
To be honest, finding developers of any age or gender, willing, talented and happy to either volunteer their time or give up a weekend (even if it is paid) to help government or organisations as they emerge blinking into the open digital world, is challenging enough. But to answer the girl question – so far I have been at a loss really, and sometimes irritated by the question. Why is no one ever happy?
But yet… it is an important question; and pertinent to me, as the mother of two daughters, one of whom is crying out to code, counts down the days to come to work with me on a hack day – and often fills in the memory gaps where I have missed vital sections of presentations.
Kidding, I don’t really know the answer, but Courtney Williams (a mentor from the National Museum of Computing, Bletchley Park, at Young Rewired State (YRS) this year) and Wendy Grossman (a freelance writer who followed and diarised YRS this year) have volunteered to look at some of the data we have and do some clever brainy things. This research will kick off in September 2011 and I will keep you all posted.
In the meantime, here’s my personal opinion based on a few years of working with developers of all ages, children of all ages, and being a Mum of an aspiring girl-geek (9) (and a teenage daughter who has no interest whatsoever – she can be the ‘control’).
=========warning====personal opinion====not based on data====based on surmise and thinking=========
I find it relatively easy to understand and explain the lack of girl geeks, in YRS as well as in the grown up world. Here goes:
- girls get self-conscious and socially conscious at around puberty/aged 12-14ish
- coding and digital prowess is still niche at a young age, self-taught by the studious. It is often considered a bit nerdy in senior school, where it is not (nor ever has been) taught as a part of the curriculum; therefore those who code have taught themselves. Teaching yourself something that should really be covered as a part of lessons is a bit like doing extra homework – *why* (ask many teens) on earth would anyone do that?
- there is no way the majority of hormonally challenged, desperate to find their place in the world, teenage girls would risk ridicule or isolation by doing such a thing – let alone be open and proud about it. (Boys of the same age have different social challenges and do not measure their societal worth so much by peer review aged 13/14.)
This is why I reckon YRS gets a higher female sign-up but greater drop-out rate just before the event. They sign up because they want to, they drop out because they cannot face the potential embarrassment <- if only they knew how heralded they would be by the achingly cool. But even the achingly cool kudos doesn’t win against the female peer group pressure.
What’s the answer?
Well, I hate to limit this to just the girl geek question, but perhaps in solving the problem of a dearth of female coders we can make a big dent in the broader problem of the lack of teaching any coding languages in the National Curriculum – anywhere.
Forget enticing computer science degrees, or trying to encourage teenage girls to pick up Python…
Year 8 is too late
Start teaching coding as a part of the curriculum in Year 5. At this point the maths is strong enough in most kids. The IT curriculum has fostered a familiarity with computing and computers and the young minds are ready to start learning programming languages. Indeed they are creative, excited and have not yet developed any association, good or bad, with certain subjects.
I don’t suggest replacing the teaching of IT, that really helps kids get to grips with spreadsheets and word processing skills (yeah OK, Microsoft products, but hey). This is a new subject, an emergent but critical one - as critical as the traditional STEM subjects with which we are all so familiar.
If it can be introduced as a part of the central curriculum in Year 5, I bet you my last penny that by the time those kids are drawn up through the education system, you would find far less of a disparity between the sexes – and maybe even an increased number of talented young people with an ability to manipulate open data, relate to code and challenge each other to design and build digital products that you and I have not even begun to imagine. Have a little imagine now… good innit?
Make one change: teach coding in Year 5 and thereafter, make it a part of the curriculum (as relevant and necessary as the traditional STEM subjects).
If you want to talk about this, share knowledge, do anything, then I will track what I can on twitter through the hashtag #yr8is2late
But I am only one person and this is not a personal campaign (yet). I want to do what I can, and I can share knowledge and experience, but it takes far more than YRS, Courtney, Wendy and myself to make a difference. And this difference would be for all young developers, not just female.
From time to time it’s good to talk to each other.
I thought that you might want to hear what has happened within the last 4 months.
So I will give you the current state of my work.
GIT & ZF2
For several months now I have focused completely on ZF2 development.
This means that all new features and changes are instantly available for the next major release and do not have to be reworked.
If you want to follow me, simply watch my github repository: https://github.com/thomasweidner
Recently I have worked on fixing several small issues. For example:
I added handling for floats within the Null filter. This had been missing in the past.
Zend_Filter_Alnum/Alpha and EmailAddress had public variables. This has been changed in favor of accessor methods, to provide consistency with the other filters.
Now Zend_Translator can be used in phar libraries.
And auto-routing has been added: the user’s browser settings are used as routing information.
GreaterThan and LowerThan can now handle equal numbers.
IP can now handle hex, octal and binary notated IP addresses.
Behind the scenes I am working on V2. It’s a complete rewrite of all locale classes.
Those are just a few examples.
In sum, more than 80 issues have been worked on and fixed since April.
Still much work to do.
Feel free to contact me directly if you think that an issue is missing or should be done first, or if you want me to write about something in detail on my blog.
For coding help please contact the mailing list.
I18N Team Leader, Zend Framework
Zend Framework Advisory Board Member
Zend Certified Engineer for Zend Framework
A while ago Techcrunch profiled a company called Nodeable, who closed $2 million in funding. They bill themselves as a social network for servers and have some cartoon and a beta invite box on their site, but no actual usable information. I signed up but never heard from them. So I’ve not seen what they’re doing at all.
Either way I thought the idea sucked.
Since then I kept coming back to it, thinking maybe it’s not bad at all. I’ve seen many companies try to include the rest of the business in the status of their networks with big graph boards and complex alerting that is perhaps not suited to the audience.
These experiments often fail and cause more confusion than clarity as the underlying systems are not designed to be friendly to business people. I had a quick twitter convo with @patrickdebois too and a few people on ##infra-talk were keen on the idea. It’s not really a surprise that a lot of us want to make the events stream of our systems more accessible to the business and other interested parties.
So I set up a copy of status.net – actually I used the excellent appliance from Turnkey Linux, and it took 10 minutes. I gave each of my machines an account, with the username being its MAC address, and hooked it into my existing event stream. It was all less than 150 lines of code and the result is quite pleasing.
What makes this compelling is specifically that it is void of technical detail: no mention of /dev/sda1 and byte counts and percentages that make text hard to scan or understand for non-tech people. Just simple things like “Experiencing high load #warning”. This is something normal people can easily digest. It’s small enough to scan really quickly, and for many users this is all they need to know.
At the moment I have Puppet changes, IDS events and Nagios events showing up on a twitter like timeline for all my machines. I hash tag the tweets using things like #security, #puppet, and #fail for failing puppet resources. #critical, #warning, #ok for nagios etc. I plan on also adding hash tags matching machine roles as captured in my CM. Click on the image to the right for a bigger example.
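A minimal sketch of that translation step, in Python with invented field names (the real hook depends entirely on the format of your own event stream), showing how a raw event becomes a short, hash-tagged status update:

```python
# Turn a monitoring event into a short status update with hash tags,
# suitable for posting to a microblogging timeline.

def to_status(event):
    """Render an event dict as '<text> #tag1 #tag2'."""
    tags = " ".join("#" + t for t in event["tags"])
    return f"{event['text']} {tags}"

event = {"text": "Experiencing high load", "tags": ["warning", "nagios"]}
print(to_status(event))
```

The actual posting is then just one authenticated HTTP call to the status.net instance per event, which is why the whole integration stayed under 150 lines.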
Status.net is unfortunately not the tool to build this on; it’s simply too buggy and too limited. You can make groups and add machines to groups, but this isn’t something like Twitter’s lists that is user-managed. I can see a case where a webmaster would just add the machines he knows his apps run on to a list and follow that; you can’t easily do this with status.net. My machines have their FQDNs as real names – why on earth status.net doesn’t show real names in the timeline I don’t get; I hope it’s a setting I missed. I might look towards something like Yammer for this, or if Nodeable eventually ships something, that might do.
I think the idea has a lot of merit. If I think about the 500 people I follow on twitter, it’s hard work but not at all unmanageable, and you would hope those 500 people are more chatty than a well-managed set of servers. The tools we already use – lists, selective following, hashtags and clients for mobiles, desktop, email notifications and RSS – all apply to this use case.
Imagine your servers’ profile information contained a short description of function. The contact email address is the team responsible for it. The geo information is datacenter coordinates. You could identify ‘hot spots’ in your infrastructure by just looking at tweets on a map. Just like we do with tweets for people.
I think the idea has legs; status.net is a disappointment. I am quite keen to see what Nodeable comes out with, and I will keep playing with this idea.
If you have used vagrant (great tool, right?) you have probably downloaded a basebox from some remote location to get you started. This is a great quick start, and there are many good boxes out there that you can use; vagrantbox.es does a great job of listing various public vagrant boxes. But if you are like me, you will probably want to customize the boxes you are using; you might want to install them from scratch based on your own little (or big) customizations. If so, then you will be happy to hear that Patrick Debois had exactly the same problem when he decided to write veewee. And veewee is exactly that missing part of vagrant that allows you to easily build your own vagrant boxes from scratch.
So let’s see how we can use veewee. I’m assuming you already have vagrant installed (and virtualbox), but if you don’t please install them first. To install veewee we just have to install the veewee gem:
gem install veewee
Once you have installed veewee you can see a new task added to vagrant: basebox.
Here is the list of the templates we get out of the box once we install veewee:
vagrant basebox templates
The following templates are available:
vagrant basebox define '' 'archlinux-i686'
vagrant basebox define '' 'CentOS-4.8-i386'
vagrant basebox define '' 'CentOS-5.6-i386'
vagrant basebox define '' 'CentOS-5.6-i386-netboot'
vagrant basebox define '' 'Debian-6.0.1a-amd64-netboot'
vagrant basebox define '' 'Debian-6.0.1a-i386-netboot'
vagrant basebox define '' 'Fedora-14-amd64'
vagrant basebox define '' 'Fedora-14-amd64-netboot'
vagrant basebox define '' 'Fedora-14-i386'
vagrant basebox define '' 'Fedora-14-i386-netboot'
vagrant basebox define '' 'freebsd-8.2-experimental'
vagrant basebox define '' 'freebsd-8.2-pcbsd-i386'
vagrant basebox define '' 'freebsd-8.2-pcbsd-i386-netboot'
vagrant basebox define '' 'gentoo-latest-i386-experimental'
vagrant basebox define '' 'opensuse-11.4-i386-experimental'
vagrant basebox define '' 'solaris-11-express-i386'
vagrant basebox define '' 'Sysrescuecd-2.0.0-experimental'
vagrant basebox define '' 'ubuntu-10.04.2-amd64-netboot'
vagrant basebox define '' 'ubuntu-10.04.2-server-amd64'
vagrant basebox define '' 'ubuntu-10.04.2-server-i386'
vagrant basebox define '' 'ubuntu-10.04.2-server-i386-netboot'
vagrant basebox define '' 'ubuntu-10.10-server-amd64'
vagrant basebox define '' 'ubuntu-10.10-server-amd64-netboot'
vagrant basebox define '' 'ubuntu-10.10-server-i386'
vagrant basebox define '' 'ubuntu-10.10-server-i386-netboot'
vagrant basebox define '' 'ubuntu-11.04-server-amd64'
vagrant basebox define '' 'ubuntu-11.04-server-i386'
vagrant basebox define '' 'windows-2008R2-amd64-experimental'
This means that we can build a box based on any of the above templates. That’s awesome! Let’s say we want to build a debian squeeze box using veewee; we would have to run:
vagrant basebox define 'debian-60' 'Debian-6.0.1a-amd64-netboot'
and this will create a folder definitions/debian-60 with the following files (the content of the veewee template):
we can modify/tune any of those files based on our custom needs. The file definition.rb is the main definition of the template; here you would define the memory size, disk size, iso file, etc. The content is very easy to understand, but you would normally not have to change many things here. preseed.cfg is just a standard preseed file where you would customize the actual install process (you could change the partitions or their type, timezone setup, etc. here). And finally, postinstall.sh is a bash script that runs at the end of the installation process; it installs ruby, gems, chef and puppet, and also the virtualbox guest additions (needed for shared folders).
If you already have the iso, place it in ‘currentdir’/iso. If not, veewee will download it and place it in the appropriate folder before starting the install process:
vagrant basebox build 'debian-60'
This will start the installation, and you can see all the steps it takes (the keystrokes as they are entered, etc.). This can take a while… Once it is done you can validate the build with:
vagrant basebox validate 'debian-60'
(this will run a few basic tests to see if it can connect to the VM as user vagrant, if chef and puppet were installed, if the shared folders are accessible, etc.).
And finally you can export it as a vagrant box with:
vagrant basebox export 'debian-60'
and add it to vagrant:
vagrant box add 'debian-60' debian-60.box
and now you can use it in vagrant with:
vagrant init 'debian-60'
That’s it. Very simple, and now we have our own box built from scratch. As a side note, I found this very useful for testing and troubleshooting preseed configurations. As you can see there are plenty of templates available in veewee, but if you create a new one please consider sharing it with others and sending it to Patrick on github. I’m sure he will be happy to include it in newer versions of veewee. And if you found this useful, don’t forget to thank Patrick for his great work on this awesome tool.
This post is a follow-up to the one I wrote about how we need to start teaching children to code in their junior years (Year 5 is my stab in the dark). This would address the issue of there being fewer female coders than male, and the fact that not enough people are equipped with this super awesome skill, whether their career ends up being in programming, car manufacture or shoe design. The post received such a wealth of feedback in the comments that I could probably write a blog post every day of the year to explore all of the stuff raised in there – I won’t, but I will try to draw out some of it.
In this post I am going to answer the question: what resources can we use to learn or teach code? This seemed to be the question immediately raised in the comments on the post and on twitter, so I have simply read all of the comments, looked at the products and listed them all here for you to use as a resource. I am pretty sure that commenters will leave further links in the comments on this post.
However, before I continue: John Godfrey, one of the commenters on my last blog post, left a link to this video. It’s just over an hour long, given by Randy Pausch, and I would love it if you could all watch it if you haven’t already, as well as read this list of resources! Bear with it – you will learn some excellent things as you watch, but you will also see the insight and inspiration behind Alice, one of the suggested links included below. (If you don’t have an hour or so free right now, then come back to it, but watch the ten-ish minutes from this point in the video); otherwise watch the whole thing here:
Deep breath… here is a list of resources (including Alice)
Scratch got many thumbs up from commenters on the last post and was indeed the basic skill many of the Young Rewired Staters brought to this year’s coding challenge. Here is the blurb from Scratch:
Scratch is a programming language that makes it easy to create your own interactive stories, animations, games, music, and art — and share your creations on the web.
As young people create and share Scratch projects, they learn important mathematical and computational ideas, while also learning to think creatively, reason systematically, and work collaboratively.
A few people mentioned this. It *is* good, but it is also a bit dated. It is described as “the term used to describe a range of programs that in various ways provide the user with the means of controlling the movement of an object on the screen (often a turtle)“. So yes, it is a great thing, but it would need some good teacher skills to make it relevant and exciting, I think. Or it could be a very painful and dreary maths lesson.
Logotron has a list of resources for teachers and teaching.
After insisting on you watching Randy Pausch’s lecture, how could Alice not feature highly? Alice is a 3d programming environment, designed to “create an animation for telling a story, playing an interactive game, or a video to share on the web. Alice is a teaching tool for introductory computing. It uses 3D graphics and a drag-and-drop interface to facilitate a more engaging, less frustrating first programming experience.”
So there is Alice 2.0 and Alice 2.2, as well as Storytelling Alice. The latter was the one mentioned by Randy as being developed by Caitlin Kelleher and is “… designed to motivate a broad spectrum of middle school students (particularly girls) to learn to program computers through creating short 3D animated movies.” <- danaaaa!! You can download Storytelling Alice here, but it is not hugely tested, is only available for Windows-based machines and has no support – but I certainly plan on playing about with it with Amy (9).
‘Proper’ Alice has full support and documentation and teaching materials and so on.
Android was recommended as an easy way to start mobile programming – “Android is a mobile operating system for mobile devices such as mobile telephones and tablet computers developed by the Open Handset Alliance led by Google.”
Indeed, just looking at all of the web resources to help a person get started in Android programming, I can see why it came so highly recommended. I found this on the Code Project website and it is a great tutorial – a great starting point for teenagers/newbie adult coders, though frustrating for littler ones unless they are already into this. I lost quite a few hours researching these links for Android programming and where you can go from there. So be warned: Googling ‘Android’ might just mean that you can sod off and teach yourself everything you want to know from a pretty decent standing start. There are bazillions of tutorials out there.
This software was recommended but it costs money. Not that I object to people charging for providing such useful resources, of course – just a warning. It is software used for “creating your own interactive Flash resources, activities, games, puzzles, quizzes.”
It is a resource really for teachers to use in schools, co-creating with children to use across subjects utilising the whiteboards (as well as websites and learning platforms). Wins an *applause* award from me for making it all relevant! But is very much aimed at younger learners.
Programmable lego *ends*
Other interesting links
Blitz Academy has a whole list of resources for those thinking about getting a job as a games developer (in fact the reading and link list is interesting for anybody even vaguely interested in anything)
Someone mentioned the Bytes Brothers books. Now that was an interesting hour lost! (Again – this post has taken nearly a week just because I keep disappearing down digital alleys.) The most useful link I could find for these was here. Here’s the blurb: “Sort of a cross between Encyclopedia Brown and Micro Adventure, each volume in this series contains several short mysteries. The user must read carefully and run very simple BASIC computer programs in order to guess the solutions.”
I wrote another post a while back for the “inquisitive”; it is for those reading this who want to try Python or Ruby, or even scraping websites.
I am not equipped with a teaching degree, so I cannot give unequivocal advice on what to teach at whatever stage; however, here is a great guide from Matthew (@pixelh8):
Year 5 = age 9-10: computational thinking, logic, cause and effect (try Scratch, Google App Inventor or Lego Mindstorms – all visual-based programming), or even Game Maker.
Year 6 = age 10-11: should definitely be coding (try Processing – very visual, very quick feedback, and free; see http://pixelh8.co.uk/category/programming-in-schools/ for code examples and http://www.wired.com/geekdad/2009/11/teaching-kids-programmers/ ).
Year 7 = age 11-12: try XNA, iPhone & Android dev. The program doesn’t have to be complex or world-changing – you just have to show them a way in. They also love being able to use and create on up-to-date tech.
Year 8 = age 12-13: some of the best iPhone developers are 13 years old.
I was sad to discover that photos belonging to both Violet and me are being reproduced without our permission at ‘photo encyclopedia’ Fotopedia.
I’m a firm believer in Creative Commons and copyright reform, and so I license practically all creative work I produce under a Creative Commons Non-Commercial Use Share Alike license. This includes all photos I take (even with my pro equipment) and all blog posts I write, including this one… basically anything which isn’t otherwise covered under an existing agreement with a client/etc or a personal photo of a family member (I don’t want those re-used) is CC-NC-SA.
The abuse of work licensed under a Creative Commons Non-Commercial license in a commercial setting – knowingly and unknowingly – is a well-covered subject.
However, it is sad and disheartening to discover companies abusing the Creative Commons license who operate within the image/photo landscape and thus are well placed to know better. It is even worse when the foundation of their entire company is based around this abuse.
You can go read their mission statement about creating a photo-based encyclopedia for humanity, but the bottom line is that they are a commercial entity, backed by $3.4m of venture funding, with mostly Creative Commons Non-Commercial photos as the foundation of their company’s database.
In addition to their venture funding, they sell mobile applications that contain a sub-set of photos, offer other mobile applications for free but with sponsorship/advertising, and solicit commercial partnerships on their website.
Fotopedia is clearly a for-profit entity operating commercially and thus their use clearly falls outside ‘non-commercial use’.
C&D > DMCA, for now
Rather than filing a series of DMCA requests, which I’m legally entitled to do, I have decided to send them a formal Cease & Desist letter, because the Creative Commons license produces some ambiguity here. I’d also like to open a dialogue with them rather than simply embark on a DMCA notice/counter-notice play.
However it is a difficult situation because if I’m right, and Fotopedia shouldn’t have any CC-NC photos on their site, then I can’t see how Fotopedia has any business. I think the Fotopedia service at its heart is interesting, it’s just a shame that it is being run by a commercial entity.
I will keep you posted with their response, but in the meantime here is a copy of my Cease & Desist letter in full.
FOR THE ATTENTION OF THE OFFICER IN CHARGE OF HANDLING COPYRIGHT COMPLAINTS, OTHERWISE THE CHIEF EXECUTIVE OFFICER
Dear Sir or Madam:
It has come to my attention that you are operating a web site found at http://www.fotopedia.com. Your web site contains the following copyrighted images belonging to myself, Ben Metcalfe. These can be found at:
These photos have been licensed under the Creative Commons Attribution Non-Commercial 2.0 Generic license (http://creativecommons.org/licenses/by-nc/2.0/deed.en) and as such may only be used, reproduced or have copies made without permission of the original copyright owner in situations where there is no commercial activity occurring.
However your use of the above images on your website “Fotopedia” clearly falls under “Commercial Use” based on, but not limited to, the following criteria:
- You are a Delaware Corporation having raised $3.4m of venture funding (http://www.crunchbase.com/company/fotopedia)
- You solicit commercial opportunities on your website (from http://www.fotopedia.com/company/mission: “For business and partnerships: email@example.com”)
- On pages such as http://www.fotopedia.com/wiki/Apture, which includes a copyright image I own, there is an advertisement for your “Above France” iOS application, which is a commercial application costing $2.99 in the Apple App Store.
As an aside, I would like to bring to your attention that you have attributed an incorrect Creative Commons license to mine and other images on your site. For example, http://www.fotopedia.com/items/flickr-152520511 points to a CC-BY-NC 3.0 Unported license (with url http://creativecommons.org/licenses/by-nc/3.0/) when the original photo on Flickr clearly links to a CC-BY-NC 2.0 Generic license with a different url (http://creativecommons.org/licenses/by-nc/2.0/deed.en). While similar, these two licenses are not identical, which further suggests a misunderstanding of, or bad faith regarding, Creative Commons licensing on your company’s part.
Given the above, your use of these copyrighted images falls outside of the narrow permissions of non-commercial use under Creative Commons and as such you have neither asked for nor received permission to use these images, nor to make or distribute copies, including electronic copies, of same. I believe you have willfully infringed on my rights under 17 U.S.C. Section 101 et seq. and could be liable for statutory damages as high as $150,000 as set forth in Section 504(c)(2) therein.
I hereby demand that you immediately cease and desist use of these images.
Based upon the foregoing, I hereby demand that you confirm to me in writing within ten (10) days of receipt of this letter that: (i) you have removed the aforementioned infringing images from your site; and (ii) you will refrain from posting any similar infringing material on the Internet, Application or any other service you control in the future. If you do not comply with my request to remove the infringing images from the web site within ten (10) days from the date of this letter, you will leave me with no other choice but to pursue all available legal and equitable remedies against you.
Yesterday Opscode, the company behind Chef, announced the first ever Chef cookbook contest. In order to participate you will need to write a new cookbook and submit it by the end of September; this is going to be a little tricky, as there are many cookbooks already available on the community site. Still, it is a great idea and it will take care of the few applications that don’t already have Chef cookbooks. The cookbooks which show off the awesome Chef features will have better chances to win. The prizes are also interesting: an iPad, gift cards, etc. Here are the full details and rules of the contest: http://www.opscode.com/blog/2011/08/22/cookbook-contest/
So if you have an idea for a Chef cookbook, now is the time to start working on it. I’m offering my help for free to all my blog readers: I will help you write a cookbook by implementing your ideas, review it, suggest improvements, or whatever else you might need. Use the contact form to email me (or DM me on twitter) and let me know how I can help.
If you don’t have time to write a new cookbook but you have a great idea for one that is missing from the Opscode community site, please post it below in the comments section and I’m sure some of my blog readers will help create it.
Again, this is a brilliant idea from Opscode and it creates a win-win situation for everyone. I’m just curious: is this the first idea from their new community manager? If so, great job, Jesse.
For the last couple of months we’ve been working with Pearson PLC, the global publishing company that owns household titles including Penguin, the Financial Times, Ladybird and Dorling Kindersley. Pearson has shown considerable innovation when faced with the challenge giving so many established publishers sleepless nights: how to make their existing content available to customers through new media channels. With this aim in mind they announced the launch of the Plug & Play Platform, intended to engage external developers and encourage the use of their data in many more diverse ways than they could hope for if handling all development internally. To achieve this, they needed an API.
We’ve been collaborating with the Plug & Play team towards the launch of APIs for three of their datasets, now live at the Pearson developer portal. The three datasets currently released are:
- Dorling Kindersley’s Eyewitness London Travel Guide: information about things to do, places to see, where to go out, eat and sleep when visiting the UK capital. The data is the same that forms their well known guide books. The API allows search based on category, location, and free text search.
- Longman Dictionary of Contemporary English: definitions, illustrations and pronunciations searchable by category, word or part of word.
- Financial Times Press: searching over 500 articles on business, management, marketing and finance.
It’s been a great experience to be part of opening up such rich content to developers. The API management system is provided by Apigee, and we at Placr developed the API functionality using the Ruby on Rails framework. A big issue for any publisher considering releasing assets historically released in print is the granularity at which to expose the data: should a fragment comprise an entire book? A chapter? A paragraph, sentence or individual word? Considering the granularity developers need, and transforming datasets into appropriate structures, has thrown up some interesting challenges. Being flexible whilst keeping the API simple to use has been a guiding principle. We’ve also worked hard to make the content available both in structured machine-readable formats (XML, JSON, JSON-P) and as a content API that can be searched and browsed by anyone with an API key using a web browser.
Apps powered by these APIs are already starting to appear. Developers Metia have released the Android app Show Me London based on the Eyewitness Travel Guide API, and Tigerspike have written a blog post describing the development of their app, built on the FT Press API.
Reaction to Pearson’s Plug & Play project has been very positive. Kin Lane on his API evangelist blog said:
“the Pearson Plug & Play team has done a great job with their pioneering efforts in this space … The team launched with a diverse set of APIs that not only provide valuable content to developers, but also gives Pearson a good place to practice when it comes to serving up content via an API.”
The reaction to the APIs on twitter has been illuminating. Phaseit asked the rhetorical question:
“Did @pearsonplc just prove itself an order of magnitude more progressive than I believed?”
Yep. I think maybe they did.
It used to be that when you registered a media type, a URI scheme, an HTTP header or another protocol element on the Internet, it was an opaque string that was a unique identifier, nothing more.
Sure, there are some substructures (e.g., vnd. and prs. in media types) to aid in avoiding collisions, but they don’t actually do anything. And even so, they need to be used judiciously (e.g., the problems inherent in x-).
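That “they don’t actually do anything” point is worth making concrete: the substructures are pure namespacing, and any code that cares about them is just doing string matching. A toy classifier of my own, loosely following the vendor/personal/unregistered naming conventions (the function name and labels are illustrative, not any registry’s API):

```python
def registration_tree(media_type):
    """Classify a media type into its naming 'tree' by subtype prefix.

    Illustrative sketch only: the prefixes carry no behaviour, they just
    partition the namespace to help avoid collisions.
    """
    subtype = media_type.split("/", 1)[1]
    if subtype.startswith("vnd."):
        return "vendor"        # vendor-controlled names
    if subtype.startswith("prs."):
        return "personal"      # personal/vanity names
    if subtype.startswith("x-") or subtype.startswith("x."):
        return "unregistered"  # the problematic x- convention
    return "standards"

print(registration_tree("application/vnd.ms-excel"))
print(registration_tree("text/html"))
```

Note that nothing here changes how the type is processed; the “meaning” lives entirely in registration practice, not in software.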
However, it’s now becoming fashionable to hang specific behaviours off of prefix and suffixes.
For example, XMLHttpRequest gives special status to the Sec- prefix in HTTP headers; basically, if a very modern browser sees a header with that prefix, it won’t allow it to be set using setRequestHeader().
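The shape of that rule is easy to sketch. This is my own rough approximation, not the browsers’ actual logic; in particular, the forbidden-name set below is a tiny illustrative subset of the real list:

```python
# Illustrative subset; the real forbidden list is much longer.
FORBIDDEN_HEADERS = {"host", "cookie", "referer"}

def may_set_request_header(name):
    """Approximate the XHR rule: refuse anything carrying the special
    Sec- (or Proxy-) prefix, plus a fixed list of sensitive names."""
    lowered = name.lower()
    if lowered.startswith("sec-") or lowered.startswith("proxy-"):
        return False
    return lowered not in FORBIDDEN_HEADERS

print(may_set_request_header("Sec-WebSocket-Key"))  # blocked by prefix alone
print(may_set_request_header("X-Custom-Header"))    # allowed
```

The point of the sketch is that the prefix check is the only extensible part: the browser never has to update a list for future Sec-* headers, which is exactly why the land grab on the prefix is attractive.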
That’s all fine and great for the folks who set up XHR; their use case has been met. However, what about the next lot that come along and want to give their headers a special prefix to do something cool? Do we come up with a weird Sec-Foo-Bar syntax?
In other words, using a prefix / suffix notation for your special use case is very workable the first — and only the first — time.
That’s not to say that XHR is alone. Many people assume that Content- is a special prefix in HTTP, and RFC2616 can be read as giving headers with that prefix special treatment in some circumstances; however, HTTPbis is removing that particular inference.
Likewise, once upon a time PEP (later RFC2774) devised a whole system of dynamic prefixes to allow distributed extensibility in HTTP. I’ve talked before about why this is philosophically flawed (and, indeed, evil), but it’s practically unworkable too, because the HTTP header registries don’t disallow people from registering new headers with those prefixes*. Luckily, exactly no one uses PEP**.
Most recently, HTML5 has defined a “web+” URI prefix to sandbox off a set of identifiers for Web applications. Again, it’s great for them, but what about for the next lot that want to put some semantic sauce into their URIs?
There’s a useful comparison to be made to a very similar syntactic convention in media types, +xml suffixes. This is actually being codified in the latest drafts of the media type registration procedure, so that you can register other formatting conventions, such as JSON, as +suffixes.
It’s easy to say “look, they’re using suffixes in media types, why can’t we use them in URIs too?” However, there are two crucial differences. First, the registration procedures are being updated to reflect the convention, and second, media type suffixes describe exactly one dimension, so there’s no potential for conflicting uses.
In other words, you’ll never have a case where two different +suffixes can both occur on a media type, because they’re defined to be mutually exclusive. application/foo+xml+json doesn’t make any sense. OTOH, unconstrained definition of these kinds of conventions can tie the hands of the whole Web without anyone even realising it, which is enough reason for caution.
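That single-dimension property shows up even in a naive parse: by convention, only the part after the last ‘+’ is a suffix, so there is exactly one suffix slot per type. A sketch of my own (the function name is illustrative, not a standard API):

```python
def structured_suffix(media_type):
    """Return a media type's structured syntax suffix, if any.

    Only the text after the *last* '+' in the subtype counts, so a
    media type can never meaningfully carry two suffixes.
    """
    subtype = media_type.split("/", 1)[1]
    if "+" in subtype:
        return subtype.rsplit("+", 1)[1]
    return None

print(structured_suffix("application/atom+xml"))  # one suffix: xml
print(structured_suffix("text/plain"))            # no suffix
```

Even fed the nonsensical application/foo+xml+json, this yields only “json” – the notation simply has nowhere to put a second suffix, which is what keeps the convention conflict-free.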
All of this will become painfully obvious as soon as the second group comes along and designs a killer-app, must-use prefix for HTTP headers or URI schemes.
Yes, I’m looking at you, browser folks. Good on you for moving first and winning the land grab, now let’s make sure we don’t all have to live with a mess for the next 30 years.
First, such conventions need to at least consider what the space of other values is. I’d argue that establishing a prefix for just one use case — even if it is huge, like Web browsing — is wasteful overkill, and should be avoided. The +suffix on media types makes sense, because having a formatting convention is a very common thing, and there’s real value in putting that information in an identifier.
While it might be pragmatic to stuff things into protocol elements, it’s not a long-term solution, and you have to think long term when you’re doing this stuff. OTOH, if your use case is important enough to justify the convention (and hey, XHR security might just be that use case), and there’s no other way to do it (I doubt…), maybe it’s the right thing to do. Which brings us to…
Second, if you’re going to establish a syntactic convention for a protocol element, it really, really needs to be reflected in the registration procedures. Work with the appropriate people at the IETF; URI scheme registration procedures are already under revision, and there’s talk brewing for HTTP headers too. Shoving your convention down other people’s throats by shipping it first and asking questions never isn’t just anti-social, it’s actively counter-productive. Look how long it took to clean up the Cookie mess, after all.
So, in the case of Sec- headers, I’ve argued before that the benefits in browser maintenance don’t justify the overall system costs that this approach incurs. It’d be easier to have the browsers auto-update the list of sensitive headers from your servers (isn’t this the direction browsers are going in anyway?).
The web+ URI scheme is similarly unnecessary, from what I’ve seen so far. E.g., why not just define a single URI scheme for the sandbox and trigger application-specific behaviours on some other aspect of the request? I haven’t seen the use cases for this one yet, so maybe I’m missing something.
Doubtless this will all be ignored, because it’s already being baked into code. All I can do is plead with people to think more than a release ahead; we’re going to be using this stuff — especially URIs! — for a long time, so let’s not muck it up.
And let’s not blame everything on the browsers. Recently Julian discovered that RFC5825 had modified the message header registry, to give special status to anything starting with Downgraded-. Although this was intended to just cover e-mail, it briefly ended up applying to HTTP and NNTP headers too. Oops.
* Granted, the registry post-dates PEP, and in the spirit of full disclosure, I was one of the people who set up the registry. But still.
** Except, apparently, Julian.