There’s something major missing in the Debian and Ubuntu world, and that’s a decent package management system. Yes, I’m going to get responses to this like “Why don’t you use Chef, or Puppet to do package management”, and I’ll give you the quick response for this: Though Chef and Puppet do configuration management well, they do not do package management well.
Package management is more than just saying “I want this package to be this specific version on these sets of systems”; package management gives you an overall view of your system in a number of ways. One view is the security profile and compliance of your entire network of systems. I should be able to quickly determine the CVE compliance of my systems. I want to be able to match a vulnerability to a package, quickly, and ensure all systems are patched for that. I should also have a view of which packages are installed on which systems, and be able to group them into system groups for easy reporting or actions.
Additionally, I want to be able to quickly and easily test packages before deploying them everywhere. I want to have a testing group of systems, a staging set of production systems, and the rest of the systems grouped, so that I can deploy packages in a sane way. I want to be able to stage package updates on a schedule so that I can do rolling package updates.
These two things are possible with Chef and Puppet, but aren’t easy (not even close to easy). These tools aren’t built for things like this. Landscape has at least a minimum set of these features. Spacewalk has all of these features.
Unfortunately, Landscape is proprietary, and Spacewalk doesn’t fully support Debian and Ubuntu.
Canonical, please open source Landscape
I understand that you need to make money somehow, but it’s somewhat hypocritical to run an open source company based on the profits of proprietary software. Red Hat seems to do pretty well without proprietary software, I’m sure you can as well. Red Hat even goes one step further and buys proprietary software just to open source it (see Fedora Directory Server, Dogtag, RHEV-M, etc.).
Don’t take this the wrong way, I’m happy with your open source offerings. The release schedule for Ubuntu, and Ubuntu server is essentially perfect for running Linux networks, and the LTS offering is great from the stability vs bleeding edge point of view; however, the lack of decent package management makes your offering excruciatingly painful to use on any large set of systems.
Fedora, help us with Spacewalk
Spacewalk is a great product. The functionality it offers for Fedora and CentOS (and RHEL via Satellite Server) is essentially perfect. It is by far one of the best management tools around. Please help expand this tool to Debian and Ubuntu.
Spacewalk version 1.4 greatly helps bring this support, but there is still a lot left to do. Of course, as I ask for the help of the Fedora community, I also hope the Debian and Ubuntu communities help with this as well.
As I recently noted, we have teamed with Adobe to deliver a solution for developers to rapidly deliver native, connected mobile applications. This joint solution, based on Adobe Flash Builder, Zend Studio and Adobe AIR, enables users to use a common code base and target multiple devices including iOS, Android and BlackBerry. Gone are the days when you had to learn Objective-C, Java and other native frameworks.
A recent research report noted that the number of mobile app downloads will reach 44 billion by 2016. That, coupled with the fact that Apple has already paid out over $2 billion to developers for apps sold on the App Store, creates a compelling story for a solution that helps developers leverage Web skills and an easy-to-use visual builder to deliver internet-connected applications across devices.
Check it out at Zend.com’s product page or Adobe.com’s product page.
One area of displaying lists on web pages that I've generally disliked doing is pagination as it's a bit of a faff. Recently, I needed to do just this though as I couldn't delegate it as my colleague was too busy on other work. As a result, I thought that I should look into Zend_Paginator this time. Turns out that it's really easy to use and the documentation is great too.
The really useful thing about Zend_Paginator is that it uses adapters to collect its data. There are a variety of adapters, including array, dbSelect, dbTableSelect and iterator. The interesting ones for me being dbSelect and dbTableSelect as I use Zend_Db based data access layers.
This is how I used it with a Zend_Db based data mapper within TodoIt.
Setting up the paginator
My current method looks like this:
class Application_Model_TaskMapper
{
    public function fetchOutstanding()
    {
        // Build the query for all tasks that are not yet completed
        $db = $this->getDbAdapter();
        $select = $db->select();
        $select->from($this->_tableName);
        $select->where('date_completed IS NULL');
        $select->order(array('due_date ASC', 'id DESC'));
        $rows = $db->fetchAll($select);

        // Convert each row into a Task entity
        $tasks = array();
        foreach ($rows as $row) {
            $task = new Application_Model_Task($row);
            $tasks[] = $task;
        }
        return $tasks;
    }

    // etc
This is pretty standard code for a data mapper. We select the data from the database and convert it to an array of entities. For the paginator to do its stuff though, we have to pass it the select object so that it can set the limit() on the select object.
The code therefore becomes:
public function fetchOutstanding()
{
    $db = $this->getDbAdapter();
    $select = $db->select();
    $select->from($this->_tableName);
    $select->where('date_completed IS NULL');
    $select->order(array('due_date ASC', 'id DESC'));

    // Hand the select object to the paginator instead of fetching the rows here
    $adapter = new Zend_Paginator_Adapter_DbSelect($select);
    $paginator = new Zend_Paginator($adapter);
    return $paginator;
}
As you can see, we create an instance of Zend_Paginator_Adapter_DbSelect, which takes the $select object, and then instantiate a Zend_Paginator and return it. The Zend_Paginator object implements IteratorAggregate, so you can use it exactly like an array in a foreach loop and hence, in theory, your view script doesn't need to change.
However, the code that consumes TaskMapper expects an array of Task objects, not an array of arrays. To tell the paginator to create our objects, we extend Zend_Paginator_Adapter_DbSelect and override getItems() like this:
class Application_Model_Paginator_TaskAdapter extends Zend_Paginator_Adapter_DbSelect
{
    /**
     * Returns an array of items for a page.
     *
     * @param integer $offset Page offset
     * @param integer $itemCountPerPage Number of items per page
     * @return array
     */
    public function getItems($offset, $itemCountPerPage)
    {
        $rows = parent::getItems($offset, $itemCountPerPage);

        $tasks = array();
        foreach ($rows as $row) {
            $task = new Application_Model_Task($row);
            $tasks[] = $task;
        }
        return $tasks;
    }
}
Here, we've used the entity-creation code that was in our original implementation of fetchOutstanding() and placed it in getItems().
Obviously we have to update fetchOutstanding() to use our new adapter, so we replace
$adapter = new Zend_Paginator_Adapter_DbSelect($select);
with
$adapter = new Application_Model_Paginator_TaskAdapter($select);
Now, when we iterate over the pagination object, we get instances of Task and all is well with the world.
Using the paginator
Now that we have a paginator in place, we need to use it. Specifically we need to tell the paginator which page number we want to view and how many items are on a page. Within TodoIt, this is done in the ServiceLayer object and looks something like this:
class Application_Service_TaskService
{
    // ...

    public function fetchOutstanding($page, $numberPerPage = 25)
    {
        $mapper = new Application_Model_TaskMapper();
        $tasks = $mapper->fetchOutstanding();
        $tasks->setCurrentPageNumber($page);
        $tasks->setItemCountPerPage($numberPerPage);
        return $tasks;
    }

    // ...
Clearly the $page parameter comes via the URL at some point, so the controller looks something like this:
class IndexController extends Zend_Controller_Action
{
    public function indexAction()
    {
        $page = $this->_getParam('page', 1);
        $taskService = new Application_Service_TaskService();
        $this->view->outstandingTasks = $taskService->fetchOutstanding($page);

        $messenger = $this->_helper->flashMessenger;
        $this->view->messages = $messenger->getMessages();
    }

    //...
and then the view uses a foreach as you'd expect.
Adding the paging controls
Finally, to complete a paged list, we have to provide the user a mechanism to select the next and previous pages along with maybe jumping to a specific page. This is done using a separate view script that you pass to the paginator. In your view script, you put something like:
<?php echo $this->paginationControl($this->outstandingTasks, 'Sliding', 'pagination_control.phtml'); ?>
The first parameter is your paginator object. The second is the 'scrolling style' to use. There are four choices documented in the manual: All, Elastic, Jumping and Sliding. Personally, I have chosen to not display the page numbers themselves, so it doesn't matter which one I pick. The last parameter is the partial view script that you want to be rendered. This allows you to have complete customisation of the HTML.
Here's what I'm using, which is based heavily on an example in the documentation:
<?php if ($this->pageCount): ?>
<div class="pagination-control">
    <!-- Previous page link -->
    <?php if (isset($this->previous)): ?>
        <a href="<?php echo $this->url(array('page' => $this->previous)); ?>">
            &lt; Previous
        </a> |
    <?php else: ?>
        <span class="disabled">&lt; Previous</span> |
    <?php endif; ?>

    <!-- Next page link -->
    <?php if (isset($this->next)): ?>
        <a href="<?php echo $this->url(array('page' => $this->next)); ?>">
            Next &gt;
        </a>
    <?php else: ?>
        <span class="disabled">Next &gt;</span>
    <?php endif; ?>

    <span class="pagecount">
        Page <?php echo $this->current; ?> of <?php echo $this->pageCount; ?>
    </span>
</div>
<?php endif; ?>
And that's it; I now have paginated tasks in TodoIt and as you can see, Zend_Paginator is very easy to use and, more importantly, simple to customise to your own needs.
This is a 100% technical conference with no marketing allowed, perfect for those that are only interested in the real stuff or those that want to get answers to problems they have *right now*.
The previous summit, held in San Francisco, was very well attended and I have heard a lot of good things about it from people that were there.
In San Francisco we had one of the MariaDB optimizer gurus holding a talk about all the advanced optimization we have added to MariaDB 5.3.
In NYC we have Kurt von Finck giving a talk about What's New In MariaDB.
Unfortunately I can't be there, even if I would like to attend :(
I was in the USA last month at the MySQL Conference and expo, and I will be in the USA again in June for Open Source Bridge. Then again in July for OSCon.
Even if I like to travel to the USA, once a month is a little too often when you live in Europe. Hope that Percona will host a summit in Europe soon ...
However, don't worry; Kurt will, of course, have with him in NYC some of the black stuff everyone is expecting from a Monty Program Ab employee.
Last but not least, for all the readers of the monty-says blog, you can get a $50 discount to the Percona Live event by using the MONTYSAYS discount code. One never knows in what kind of places this discount code may work... ;)
We’re working on a series of two-day Stack Overflow conferences for the fall:
“What’s this conference about? The idea for the original DevDays was to have high-bandwidth, intensive introductions to a wide variety of new technologies… the kinds of technologies that everybody wants to learn but doesn’t necessarily need to use on a project right now. Last time, it was things like iPhone development, Python, jQuery, Google AppEngine, etc. This year, we’re asking you. So far, there’s a lot of interest in DVCS, HTML5, and Node.js.”
I launched the HTTP Archive about a month ago. The reaction has been positive including supportive tweets from Tim O’Reilly, Werner Vogels, Robert Scoble, and John Resig. I’m also excited about the number of people that have already started contributing to the project. Two new stats charts are available thanks to patches from open source contributors.
James Byers contributed the patch for generating the Most Common Servers pie chart. This chart is similar to BuiltWith’s Web Server chart. BuiltWith shows a higher presence of IIS than shown here. Keep in mind the sample sets are different – the HTTP Archive hits the world’s top ~17K URLs while BuiltWith is covering 1M URLs.
The other new chart comes from Carson McDonald. It shows pages with the most 404s. Definitely a list you don’t want to find your website on.
I’ve added some other features I’ll blog about tomorrow and am planning a bigger announcement later this week, so stay tuned for some more HTTP Archive updates.
CloudFlare is excited to announce that the WordPress performance plugin W3 Total Cache (W3TC) now fully integrates CloudFlare's performance and security. CloudFlare and W3TC's missions are aligned: making sites perform as fast as possible. If you're a WordPress user, W3TC runs on your server and helps optimize your database and content production. After that, CloudFlare takes over and our globally distributed network ensures your site's content is delivered as fast as possible while, at the same time, preventing attacks from harming your site.
I always have difficulties with complex analysis schemes, so I fall back to something that is somewhat easier. Or much easier. Here I will explain the super-powerful method of database write workload analysis.
Doing any analysis on master servers is already too complicated: instead of analyzing write costs, one can become too obsessed with locking, and there is sometimes an uncontrollable amount of workload hitting the server besides writes. Fortunately, slaves are much better targets, not only because writes there are single-threaded, which exposes every costly I/O as a time component, but also because one can drain traffic from slaves, or send more, in order to produce a more natural workload.
Also, there can be multiple states of slave load:
- Healthy, always at 0-1s lag, write statements are always immediate
- Spiky, usually at 0s lag, but has jumps due to occasional slow statements
- Lagging, because of read load stealing I/O capacity
- Lagging (or not catching up fast enough), because it can’t keep up with writes anymore, even with no read load
Each of these states is interesting by itself, and may have slightly different properties, but pretty much all of them are quite easy to look at using replication profiling.
The code for it is somewhat straightforward:
(while true; do
echo 'SELECT info FROM information_schema.processlist
WHERE db IS NOT NULL AND user="system user"; '
sleep 0.1; done) | mysql -BN | head -n 100000 > replication-sample
There are multiple ways to analyze it, e.g. finding slowest statements is as easy as:
uniq -c replication-sample | sort -nr | head
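The one-liner works because a statement that shows up in N consecutive samples was running for roughly N sampling intervals. A minimal sketch of turning those counts into approximate seconds, assuming the 0.1 s interval from the sampler above; the names INTERVAL and slowest_statements are made up for this example:

```shell
#!/bin/sh
# Sketch only: convert consecutive-sample counts into approximate run time.
# Assumes the sampler polled every 0.1 s (the `sleep 0.1` above), so the real
# interval is slightly longer once query time is included.
INTERVAL=0.1

# slowest_statements [N]
# stdin:  raw sample lines (one statement per line)
# stdout: "<approx seconds><TAB><statement>" for the N longest runs (default 10)
slowest_statements() {
    uniq -c | sort -nr | head -n "${1:-10}" |
        awk -v step="$INTERVAL" '{
            n = $1
            sub(/^ *[0-9]+ /, "")      # strip the count column added by uniq -c
            printf "%.1f\t%s\n", n * step, $0
        }'
}

# Usage (hypothetical): slowest_statements 20 < replication-sample
```

The times are lower bounds rather than exact durations, since a statement can start and finish between two samples.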
More advanced methods may group statements by statement type, table, user ID or any other metadata embedded in query comments. A lot of value can be obtained from ad-hoc analysis using simply ‘grep -c keyword replication-sample’ to understand what share of your workload a certain feature has.
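As a rough illustration of that kind of grouping, here is a sketch that breaks the sample down by the first keyword of each statement and reports the share of samples matching a keyword. The function names statement_mix and keyword_share are invented for this example, and the percentages are shares of sampled (busy) time, not of statement counts:

```shell
#!/bin/sh
# Sketch only: ad-hoc breakdowns of a replication sample file.

# statement_mix
# stdin:  raw sample lines
# stdout: "<share %><TAB><first keyword>", largest share first
statement_mix() {
    awk 'NF { count[toupper($1)]++; total++ }
         END { for (k in count) printf "%.1f\t%s\n", 100 * count[k] / total, k }' |
        sort -nr
}

# keyword_share KEYWORD FILE
# Prints the percentage of sample lines that contain KEYWORD.
keyword_share() {
    hits=$(grep -c -- "$1" "$2")
    total=$(wc -l < "$2")
    awk -v h="$hits" -v t="$total" 'BEGIN { if (t > 0) printf "%.1f\n", 100 * h / t }'
}

# Usage (hypothetical):
#   statement_mix < replication-sample
#   keyword_share some_table replication-sample
```

Because every sample line represents one sampling interval, a statement type with a 50% share accounted for about half of the time the replication thread was busy.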
I already mentioned that slave performance comes in different shapes, and it is easy to test each of them. One method is to stop a slave for a day, then run the sampler while it is trying to catch up. It will probably have much more buffer pool space usable for write operations, so keep that in mind: operations that depend on larger buffer pools will be much faster.
This is a really simple, yet remarkably powerful method that allows quite deep workload analysis without spending too much time on statistics features. As there’s no EXPLAIN for UPDATE or DELETE statements, longer, coarser samples allow detecting deviations from good query plans too.
Systematic use of it has revealed quite a few important issues that had to be fixed and that were not obvious from the general statistics view. I like.
In the past six weeks, I've delivered both a webinar and a tutorial on Zend Framework 2 development patterns. The first pattern I've explored is our new suite of autoloaders, which are aimed at both performance and rapid application development -- the latter has always been true, as we've followed PEAR standards, but the former has been elusive within the 1.X series.
Interestingly, I've had quite some number of folks ask if they can use the new autoloaders in their Zend Framework 1 development. The short answer is "yes," assuming you're running PHP 5.3 already. If not, however, until today, the answer has been "no."
Continue reading "Backported ZF2 Autoloaders"
Last week Google Analytics announced that they would begin integrating Site Speed tracking. This is an awesome new way for site administrators to see how well their pages are performing. CloudFlare has always been a fan of Google Analytics. Since our public launch we've offered web administrators a way to ensure Google Analytics is installed on all our users' pages, in the best possible placement, and with the latest and greatest version. As soon as we heard about new Site Speed tracking, we made changes on our system to make the new version of Google Analytics available for anyone using CloudFlare. We tested the new code last week and, beginning today, have pushed it live throughout our network.
It’s great to see so much feedback coming in about my Qt 5 blog two days ago. We’ve read and gone through all the comments, but it’s easier to try to answer the questions and concerns in a follow-up post rather than replying to comments.
We have now created a mailing list for discussions about Qt 5. If you’re interested, please consider subscribing. This will allow us to have better and more structured discussions around Qt 5 than replies to a blog post.
As far as I can see the main concerns can be grouped into a few categories. I’ll try to answer these from my point of view.
QML/JavaScript versus C++
There were a lot of reactions to the statement that we would like to make JavaScript a first class citizen for application development. What we are talking about here is adding another option, not removing C++. Qt is at its heart a C++ framework and this will not change. Quite the opposite, I see C++ as continuing to be extremely important. There are many use cases where you need to be able to write native code.
So we will continue to add new C++ APIs and improve our existing C++ APIs to give you all the power and options one expects from a native framework. On the non-graphical side (i.e. QtCore, Network, database, etc.) nothing really changes, and on the graphical side we for now simply add another option.
That being said, we do believe that Qt Quick is the better technology for creating user interfaces. Right now, it still misses some things that developers (esp. on desktops) need. But once we combine what e.g. ListView offers with the desktop components Jens has been prototyping, we will have a pretty good offering for many desktop use cases. The goal is certainly to improve this over time, and we believe this will make everybody’s life easier in creating compelling, modern-looking applications.
JavaScript fits in naturally here, as QML is syntactically an extension to JavaScript. Being able to write or prototype parts of your application logic in a scripted language makes many things easier. Remember that this is a choice Qt wants to offer, not a requirement.
As to those asking for other scripting languages: No, we don’t intend to integrate these into Qt or QML. There are many reasons for that, but the simple ones are that it would seriously complicate the Qt Quick architecture and it could slow us down. It’s better to do one language and do it really well than to have something that will in the longer term become unmanageable. Having one consistent offering for developers is also very important here. Multiple scripting languages would just be confusing. JavaScript was chosen because there are extremely high-performance engines out there, because the basics of the language are very similar to what our developer base is used to from C++, and because it has the widest user base of all scripting languages out there.
QWidget and QGraphicsView
The QWidget-based classes will not suddenly disappear. Neither will they suddenly stop working. The exact goal with moving them into a separate module is to ensure our efforts on Qt Quick/QML will not cause problems for these classes. We ourselves intend on continuing to use them in Qt Creator and other applications for the foreseeable future.
We also don’t believe there will be bigger performance problems using these classes once we have re-architected the graphics stack in Qt. They might even become a bit faster on some of the platforms (but that’s a subject for a post of its own). Remote X11 connections are a use case that might take a slight hit. However, the trend in the industry has been moving away from X11 anyway. If there’s a strong interest by a group of people to support this use case, there is an opening to pick this work up and integrate with technologies such as NX. With the Lighthouse architecture, this might be just one plug-in away.
Together this means that there is no need for you to start worrying about having to rewrite the UI of your application in Qt Quick. But again, we believe strongly that it will in the longer term become the better and easier option compared to QWidgets.
Requiring OpenGL
As stated in the post we will require OpenGL 2.0 support for Qt 5 (Desktop GL or GL ES). The reason is simply that this gives a much better common ground for application developers. It also helps to significantly simplify our internal architecture.
We went through all the same concerns that I saw in responses to the blog post about having it as a requirement. In the end we came to the conclusion that it would be the best path forward and should not pose a real problem to most of our users.
On Mac and Windows this should not cause any real problems. Mac OS X has good OpenGL support on all their devices. On Windows we can use the ANGLE library to translate OpenGL ES to DirectX if it turns out that there are issues with native GL support.
On Linux desktops the recurring problem is the quality of some of the drivers. This is unfortunately a bit outside of our scope to get solved. But there is an alternative being worked on by the community that can be as good as (or better than) our current software rasterization: Mesa with llvmpipe. Having GL available consistently on Linux would be a tremendous improvement to the whole Linux stack.
The last part is low-end hardware/embedded systems without a GPU. We are seeing that these systems are going away fast and that the price difference between SoCs with and without a GPU is becoming very small. Given the great advantages a GPU offers to the achievable user experience and the fallback option to software GL outlined above, we believe it’s better to bet on OpenGL here as well.
Other things
We’re looking into some of the other features that have been asked for (like better C++11 support), but please remember that Qt 5.0 is not the end of the line and we will have 5.x releases where many of these improvements can then come in.
For 5.0 we have to carefully limit the scope of what we do. We will be working hard to not break anything that doesn’t need to be broken, and we’ll also be working to get 5.0 out in a timely manner.
Hi QtWebKittens and other friends!
Now it’s for real: QtWebKit-2.2 has been tagged as TP-1 (qtwebkit-2.2-tp1 @ http://gitorious.org/+qtwebkit-developers/webkit/qtwebkit).
This is our first official TP of QtWebKit-2.2. It’s a new, fresh branch from WebKit trunk and includes a large number of new features, enhancements and fixes when compared to both QtWebKit-2.0 (Qt-4.7) and QtWebKit-2.1.x. QtWebKit-2.2 will be supported on all major desktop and mobile platforms and will be part of the next Qt official release and Qt-SDK.
This tag has passed through basic tests (build + navigation on a few websites) on Linux Desktop, Linux Embedded, Windows XP (32), Windows 7 (32) and OSX 10.6 (Snow Leopard.) We strongly encourage everybody who is interested in QtWebKit to clone the repository and run tests on their environments, reporting any issue on this mailing list, IRC or bugzilla.
Information about the QtWebKit-2.2 release, including a roadmap, build instructions and misc links, is on the release wiki: http://trac.webkit.org/wiki/QtWebKitRelease22 (wiki edits are welcome.)
We don’t have pre-built binaries at this moment. If you’re a packager and are able to prepare packages for your favorite Linux distribution or Windows flavor, please consider sharing them with us.
Happy hacking and happy testing!
Qt Mobility 1.2.0 has been released today. After many hours of design, coding, testing and fixing, Qt Mobility 1.2.0 is now officially ready!
This Qt Mobility version brings a number of new features as well as improvements to existing modules. The newest module is Connectivity, which allows easy integration with Bluetooth devices as well as the NFC (Near Field Communication) ready devices. New features, such as result limit hinting for Contacts and network streaming configuration for Multimedia, were also added to existing modules. The minor version bump to 1.2 allowed major modifications such as API additions to QtMessaging, independent object builder for QtServiceFramework, additional sensor support for QtSensors, and many more improvements and bug fixes to all existing Qt Mobility modules. More information can be obtained in the change files as well as through Jira [http://bugreports.qt.nokia.com/].
This Qt Mobility version also improves the coverage of plugins for Qt Quick apps with many of the modules now having QML support. This will allow developers to interact directly with modules through the native QML APIs.
Now that APIs are final, they need to be verified for the platform targets and integrated into the Qt SDK. Please note that Qt Mobility 1.2 is created for the upcoming Symbian and MeeGo devices. Targets will be made available accordingly with updates of the Qt SDK.
You can download the source and binary packages here:
http://get.qt.nokia.com/qt/add-ons/qt-mobility-opensource-src-1.2.0.zip
http://get.qt.nokia.com/qt/add-ons/qt-mobility-opensource-src-1.2.0.tar.gz
MeeGo packages: http://download.meego.com/live/devel:/qt-mtf/Trunk/
- sudo zypper addrepo http://download.meego.com/live/devel:/qt-mtf/Trunk/ devel-qt-mtf
- sudo zypper refresh
- sudo zypper install qt-mobility
- sudo zypper install qt-mobility-examples
You are also able to retrieve the source directly from the qtmobility public git repository [http://qt.gitorious.org/]; the ‘v1.2.0’ tag is now present. The Qt Mobility 1.2 documentation has also been updated [http://doc.qt.nokia.com/qtmobility-1.2/index.html].
Thanks to everyone for waiting patiently and please enjoy.
Release Engineering
Right now the HTTP Archive analyzes the world’s top 17,000 web pages gathering information about the site’s construction. It’s interesting data, especially for a performance junkie like me. Subsetting the data for comparisons is a challenge given the numerous ways this long list of URLs could be sliced.
This past week I added two new subsets: Top 100 and Top 1000. So now when you go to the Trends and Stats pages you have the following choices:
- All – The entire set of ~17K web pages but the exact URLs might vary from run to run due to errors, etc.
- Intersection – The set of URLs for which data was gathered in every run. Right now there are ~15K that have data in every run so far. This is great for apples-to-apples comparisons.
- Top 100 – The world’s Top 100 based on Alexa.
- Top 1000 – The world’s Top 1000 based on Alexa.
There are some amazing differences between these sets of sites. To make it easier to explore these differences I added the Compare Stats page. You can pick two different runs and sets of URLs and see their stats charts side-by-side. I’m amazed at some of the differences between the Top 100 and Top 1000 for stats from the April 30 2011 run:
- total bytes downloaded: 401 kB vs 674 kB (Top 100 vs Top 1000)
- use of jQuery: 27% vs 43%
- use of Google Analytics: 25% vs 52%
- pages using Flash: 35% vs 49%
- resources with no caching headers: 26% vs 41%
Top 100 Bytes Downloaded
Top 1000 Bytes Downloaded
One takeaway is that stats that are critical for good performance (bytes downloaded, caching) worsen when we expand from the Top 100 to the Top 1000. What happens if we look further down the tail at “All”? I encourage you to check it out yourself and see how your site compares.
Last week, I blogged about the framework of levels we’re proposing to use for Qt in the context of Open Governance and future development. I said then that we had also taken the time to look into what code we have in Qt and decide what level it should be in. Today, I’d like to share the list that was the result of that work.
First of all, please understand that this applies to future releases of Qt, most importantly Qt 4.8 and 5.0. Since we never add or remove features in existing and current release series, the framework doesn’t apply there, except in what kind of bugfixes we might be accepting. For example, a bugfix for a corner case could potentially introduce regressions, so the community needs to take into account the quality level of the code and the ability to execute regression and release testing before accepting said commits. Moreover, it’s also possible to obtain differentiated support for past releases from other companies, especially from Digia. This is however outside the scope of Open Governance.
Second, the list below contains only what is not in states Active or Maintained. We felt that the most important thing to do right now was to be really honest about what we (Nokia) are working on and what we’re not working on. This is especially important for people reporting bugs, since it is unlikely that we will fix low-priority issues in subsystems that are in the Done state or lower. What’s not on the list below is then under the state “Active” or “Maintained”, like for example QtDBus. Also note that this list focuses on Qt only, which is the older codebase, containing more legacy.
This first publishing of the list is, by necessity of a blog, a static list. In reality, when Open Governance and Qt 5 work kick in, this list will be very much dynamic, so I’ll be importing it into the Qt Developer Network to be edited and kept up-to-date. So I want to be very clear that the list can change and modules may go up or down in the Level scale. The only things that should not happen are: a) sacrificing quality and b) taking Qt in the wrong direction (backwards). As an example of the latter point, functionality that gets Removed from Qt should not be brought back: we don’t want to support IRIX, the non-Unicode Windows versions or Microsoft Visual Studio 6.0.
Finally, let me remind you that Done is not Deprecated! Done really means “stability and zero regressions are the most important things, so we are not adding features and we are not working on improving performance, but it’s fine to use this code”.
The list
Modules
- ActiveQt
  Overall module state: Done
  New Maintainer Required
- Phonon
  Overall module state: Done inside Qt, Maintained outside of Qt
  Reasoning: QtMultimediaKit recommended instead; development of Phonon continues and is maintained outside of Qt, by the KDE community.
- qmake
  Overall module state: Done
  Reasoning: stable code, we don’t recommend bringing it up in the level list. Research into a future, more modern buildsystem has started.
  - XCode integration
    State: Deprecated
    New Maintainer Required.
- Qt Designer
  State: Done
  Reasoning: Qt Quick is recommended for developing UIs from now on, so the new Qt Quick Designer should take over the capabilities of the classic Qt Designer.
- Qt3Support
  Overall module state: Deprecated
  Reasoning: Qt3Support was provided as a porting layer from Qt 3, so the recommendation is to finish the port. Qt3Support will be Removed in Qt 5.
- QtCore
  Overall module state: Active/Maintained
  - QFileSystemWatcher
    State: Deprecated
    Reasoning: flawed design, a replacement is required. We’re open for ideas in that area.
  - Abstract file engines
    State: Deprecated
    Reasoning: flawed design, this is the wrong level to provide a virtual filesystem, so we don’t recommend taking this over. In Qt 5, this functionality will be Removed.
- QtDeclarative
  Overall module state: Active/Maintained
  - Graphics view support (i.e., QML 1.x)
    State: Done
    Reasoning: QML Scene Graph-based QML 2 is recommended and will become available in Qt 5.
- QtGui
  Overall module state: Active/Maintained
  More information about reorganisation of this module, see the Qt 5 blog.
  - XLFD support for the font system
    State: Deprecated
    Reasoning: this is obsolete functionality in X11 as modern systems use client-side fonts; doesn’t affect other platforms.
  - Graphics Effects
    State: Deprecated
    Reasoning: flawed design, we don’t recommend taking maintainership of this code.
  - Graphics View
    State: Done
    Reasoning: stable code for which stability and reduced risk of regressions is more important; we don’t plan on adding more features.
  - Implicit native child widget
    State: Done
    Reasoning: flawed design, we don’t recommend taking maintainership of this code.
    Note: widgets with explicit native window handles, like Direct3D view, will still be supported.
  - Printing support
    State: Done
    New maintainer required.
    - Postscript support – Deprecated
      Reasoning: obsolete support, PDF is enough nowadays.
  - QPainter
    State: Done
    Reasoning: stable code for which stability and reduced risk of regressions is more important; we don’t recommend bringing the maintainership level up.
    - Raster and OpenGL (ES) 2 engines – Maintained.
    - Other engines – Done and New Maintainer required.
  - QPainterPath’s “set” operations
    State: Deprecated
    Reasoning: flawed design, we don’t recommend taking maintainership of this code.
  - QPicture
    State: Deprecated
    New maintainer required.
  - QSound
    State: Deprecated
    Reasoning: better solution available in QtMultimediaKit.
  - Styles
    State: Done
    Reasoning: stable code for which stability is extremely important, so we don’t recommend bringing the maturity level back up; Qt Quick-based development is expected for the future of UIs and, with it, Qt Quick-based theming and style possibilities.
    - Motif and CDE styles – Deprecated
      Reasoning: obsolete.
  - Stylesheets
    State: Done
    Reasoning: stable code for which stability is extremely important, so we don’t recommend bringing the maturity level back up; Qt Quick-based development is expected for the future of UIs and, with it, Qt Quick-based theming and style possibilities.
  - Widget classes like QPushButton, QLineEdit, etc.
    State: Done
    Reasoning: stable code for which stability and reduced risk of regressions are important, so we don’t recommend bringing the maturity level back up; Qt Quick-based development is expected for the future of UIs, with Qt Quick Components.
  - XIM support
    State: Deprecated
    Reasoning: flawed design, we don’t recommend taking up maintainership of this code.
- QtNetwork
  Overall module state: Active/Maintained
  - QHttp and QFtp
    State: Deprecated
    Reasoning: replaced by QNetworkAccessManager; we welcome research supporting the filesystem functionality of FTP that is not currently present in QNetworkAccessManager. In Qt 5, these classes will be Removed.
- QtScript
  Overall module state: Active/Maintained
  - QScriptEngineAgent and related classes
    State: Deprecated
    Reasoning: flawed design, being replaced by a better design.
- QtSql
  Overall module state: Done
  New maintainer required.
- QtSvg
  Overall module state: Deprecated
  New maintainer required
  Reasoning: SVG Full (as opposed to SVG Tiny) functionality available in QtWebKit, which should be used instead; we welcome research for a replacement for the SVG-generating code.
- QtWebKit
  Overall module state: Active/Maintained
  - QWebView and QGraphicsWebView
    State: Done
    Reasoning: moved to a separate library in Qt 5, the main entry point for web views will be the Qt Quick-based “webview” component.
- QtXml
  Overall module state: Done
  Reasoning: QXmlStreamReader and QXmlStreamWriter are recommended instead and are located in the QtCore module.
- QtXmlPatterns
  Overall module state: Done
  New maintainer required.
Functionality
- Carbon support in Mac OS X
  State: Done
  Reasoning: Cocoa support is available and Carbon cannot be used to create 64-bit applications. We’d like eventually to Deprecate and even Remove this functionality during the Qt 5 lifecycle, as this code is currently a maintenance burden (like Windows non-Unicode was).
- HP-UX, AIX and Solaris support
  State: Done
  New maintainer required.
- Old Qt solutions archive
  State: Deprecated
  Reasoning: old code, not maintained anymore.
- Bearer Management inside Qt Mobility
  State: Deprecated for now, will probably be Removed in Qt 5.
  Reasoning: the copy of the code maintained in QtNetwork is the recommended interface.
- Qt Multimedia inside Qt
  State: for 4.8 it is Deprecated, in Qt 5 it is replaced by the Qt MultimediaKit copy with the modularisation of Qt.
- Phonon copy inside Qt
  State: Done
  Reasoning: a separate release of Phonon, with its own version numbers, is available and can be used instead; the copy inside Qt will not be updated further.
- Qt WebKit copy inside Qt
  State: Deprecated
  Reasoning: a separate QtWebKit release, with its own version numbers, is available and should be used instead with Qt 4.7 and 4.8, for those looking for updates. In Qt 5, the separate release is reintegrated through the Qt modularisation.
- QWS (a.k.a. the current Qt for Embedded Linux)
  State: Done for Qt 4.8
  Reasoning: the new Lighthouse-based architecture is recommended for new features and new platforms.
- Static builds of examples and demos
  State: Removed
  Reasoning: this is not maintained or checked and the Qt binary builds are always dynamic. Static builds aren’t required for reading the source code and learning Qt, they are never deployed to devices.
- Static builds on Mac, Windows and Embedded Linux
  State: Done
- Windows CE port
  State: Done
  New maintainer required.
- WINSCW port
  State: Done
  Reasoning: old and buggy compiler, required only for Symbian simulator builds, should be replaced with a new technology once that is available.
Changing the list
As I said before, this list should live in the Qt Developer Network wiki, where it will be updated as the states change. So how do they change?
The state of a given piece of functionality or module is the choice of the maintainer of that code, who is ultimately responsible for its quality. So the decision on whether to accept new features, and to what extent other kinds of changes should be accepted, is also the responsibility of this person. Therefore, to change the state, you have to either convince the current maintainer or become a maintainer yourself.
In that light, we have been discussing with current contributors to Qt and asking them whether they would like to volunteer for maintainership of anything. Digia has already volunteered to find someone to maintain the AIX and Solaris ports and KDAB has done the same for Windows CE. Becoming a maintainer for something inside Qt shouldn’t be too hard, but it takes time and dedication, because it comes with a responsibility. (For that reason, we’d like people to prove that they can do it first, such as by maintaining a branch.)
More discussions on this should begin with Open Governance and the Qt Contributors Summit.
This morning during my talk at Mobilism I announced the HTTP Archive Mobile – a permanent repository for mobile performance data.
Ever since I announced the HTTP Archive (based on data gathered from Internet Explorer using WebPagetest) people have been asking, “What about data gathered from a mobile device?” It’s a logical next step. Thankfully, Blaze.io came on the scene a few months ago with a solution, Mobitest, for connecting to mobile devices and gathering this kind of information. Blaze.io has been doing great work in the area of mobile performance. I met Guy Podjarny, Blaze.io’s CTO, a few months ago and we’ve had several discussions about performance. A few weeks ago Guypo and I decided to start working on the HTTP Archive Mobile. It required a few changes on his side and a few on mine, but we quickly got it up and running and have been gathering data for the past week in anticipation of this announcement.
Here are some interesting comparisons of the Top 100 web pages from desktop vs mobile:
- Total bytes downloaded is 401 kB for desktop vs 271 kB for mobile.
- Total size of images, scripts, and stylesheets is smaller on mobile, but size of HTML is bigger. This is likely from JavaScript, CSS, and images being inlined in the HTML document – a good sign for performance.
- The New York Times is in the top 5 pages with the most JavaScript with 230 kB on desktop and 326 kB on mobile – 94 kB more JavaScript on mobile than desktop.
- The percentage of sites that have at least one redirect is 54% for desktop compared to 68% for mobile. Many websites (Facebook, Yahoo!, Bing, Taobao, etc.) redirect from www.* to m.* on mobile browsers. But it would be better to lower the mobile percentage since redirects inflict an even greater delay on mobile devices.
This is just the beginning. The HTTP Archive Mobile is currently in “Alpha” mainly because we’re gathering data for just the Alexa Global Top 100 sites and only have a week’s worth of data. More URLs and more data are on the way. Already there are valuable takeaways from what’s there, and these will increase as the number of websites and length of time grow. Take a look for yourself and add comments below on anything you find that’s interesting or puzzling. And thanks again to Guypo and Blaze.io for making the Mobitest framework available for collecting the data.
On Monday, when he exposed his thoughts about Qt 5, Lars mentioned that he would be at the MeeGo Conference in San Francisco later this month. At the time, he hadn’t yet submitted his session proposal.
I’m happy to announce that the program committee has decided to accept his submission, as well as the BoF session I proposed to plan for Qt 5. If you’re coming to the MeeGo Conference or if you’ll be in the San Francisco area then, come join us for the first discussion on how to get a Qt 5 release in a year’s time.
Another interesting session I plan on attending in San Francisco will be the one on using Wayland for the MeeGo Tablet, by Kristian Høgsberg and Liu Xinyun. These two are just a few examples of new sessions that the program committee has accepted as part of the “Late Breaking News”, further improving a program that was already great.
We have several trolls presenting at the event:
- I’ll be giving my usual update on the Qt Open Governance;
- Min will be talking about the APIs in Qt Mobility 1.1 and 1.2;
- Lars’s session, of course, right after Min’s talk;
- Aaron McCarthy will be talking about how to use NFC in Qt applications;
- Jørgen has been preparing a presentation on Qt running on Wayland and a Wayland compositor written with Qt (and I saw a sneak preview of his presentation today at the office — not to be missed);
- Donald will present the Qt Media Hub demo project;
- Frederik will talk about Accessibility;
- Plus several BoF sessions…
On the second day, there’s a full track on developing applications using Qt Quick, in parallel to a full-day track on how to use the SDK for creating and deploying applications.
Other talks I would like to see:
- Ariya’s (ex-troll) always fascinating updates on what you can accomplish with WebKit;
- Cross-device application development by two friends from INdT (Caio and Anselmo);
- Orange’s and Accenture’s plans for success using MeeGo;
- Marius’s (ex-troll) tales from the trenches on how he’s making money creating apps;
- Why Peter Winston (from ICS) bet his company on Qt and MeeGo;
- MeeGo running on a quite different kind of vehicle;
I clearly won’t be able to see all the talks… In Dublin, I spent most of my time introducing people to other people and I don’t expect San Francisco to be any different. Unlike Dublin, I don’t expect to have to walk too far between rooms, but there will be a fair bit of going up and down stairs.
Still, I think we have a great line-up of talks. I’m hoping this will please everyone in the audience. At the program committee, we spent a great deal of time reading every single submission and trying to make a conference for everyone in the MeeGo community.
Now, I think I should get started on my presentation. The email I received on Wednesday says I should upload it by the end of today in San Francisco (9 am tomorrow my time).
Happy Friday 13th!
Hi Folks,
Here are my slides from the IBM big data symposium. This was a good event. IBM announced a new release of their Apache Hadoop based Big Insights platform. It is great to hear their commitment to Apache. Yahoo was there talking about our experiences and uses of Hadoop. I got a lot of questions about why we invest in Hadoop, so let me point you back to my post on that and our commitment to Apache Hadoop. (http://yhoo.it/e8p3Dd and http://yhoo.it/i9Ww8W)
Thanks, E14
BitCoin
I’ve spent the last few months researching BitCoin, which for those who don’t know is a p2p currency system that is (sort of) decentralized and certainly separate from any government control or intervention. There’s some pretty sophisticated technology behind it which ensures true scarcity (the fundamental issue for any economic model) but it also enjoys some impressive security features and seems to be incredibly solid on the privacy front.
What is particularly interesting is that you can essentially ‘mine’ BitCoins by running complicated algorithms on your computer (or servers), which is how new BitCoins are created (although that also creates an inflation factor in the market of course). If you are an economics wonk you’ll cream yourself over BitCoin. There’s already a ton of interesting thoughts on the cost of electricity to ‘mine’ a successful BitCoin chain vs the value of the unit of currency, plus numerous trading markets etc.
Many wonder if BitCoin is legal – but that’s a superfluous question because it’s totally uncontrollable and the distributed nature means it doesn’t exist in any one jurisdiction. Certainly if it becomes a way for terrorists and organized crime to launder money then I guess we’ll really see governments stepping in.
Anyway, I’m seeing signs that BitCoin is about to move out from being an underground project and into mainstream focus over the next few weeks or so. It will be interesting to see some sunlight on it from the existing financial world, as I am still on the fence as to whether it is a folly or the early start of something significant.
PlayStation Network
Having (apparently) fixed their security problems, Sony have powered up the servers powering the PlayStation Network and reopened it to users. I asked on Twitter whether anyone would actually be jumping back in.
My guess is that kids who don’t care about the issues, and who are probably using their parents’ credit card anyway, will get straight back on there. So will die-hard gamers who prefer human-based competition, as PSN is their only option.
But the growth and strategic opportunity for Sony Playstation Network is the ability to deliver services like Netflix, IPTV, games on demand, etc. The problem is consumers have many choices there, with competition not just from rival Microsoft XBox Live (and Nintendo’s next gen console) but Google TV (if they get their act together), Apple TV (ditto), Roku, Boxee, etc. While the same security problems could theoretically be faced by those vendors too, the bottom line is that they haven’t had those problems. And consumers are rightly worried about Sony’s security.
Google IO
Google IO happened on Tuesday and Wednesday this week – I’ve attended every one since they began in 2007.
It costs ~$500 to attend Google IO (assuming you can get a ticket). The event takes place at the Moscone West conference center, where Google is required to use the conference center’s in-house catering company for all food and beverage. The rumor doing the rounds at the event was that, in order to meet Google’s own standards for quality of food, the ticket price for the event barely covered the cost per attendee for food (two lunches, evening reception and snacks). Frankly, those numbers add up to me. The food was the best I’d ever had at a conference, and the logistics of feeding 5000 people are just insane.
But my point isn’t about the food. The point is that every year Google puts on one of the slickest and highest quality conferences in the conference calendar, at one of the most expensive conference venues in the country. And it bankrolls the event itself, with the ticket price being a mere drop in the budget. I’m guessing Google easily sinks $10m+ into those two days. It might even sink $50m for all I know – I have no idea what it costs to rent the Moscone West center for 2 days + setup and tear-down.
The announcements themselves were exciting and refreshing – like Google open sourcing all of its hardware accessory development, unlike Apple, which requires accessory makers to get their devices certified (i.e. pay $$$). But that’s for another post.
The point is I can’t think of another company that makes the level of investment into Developer Relations that Google makes. It’s really quite incredible.
Call to action
My weekend musings are based on things I observe and comment on during the week over on various social sites. If that interests you, make sure you follow me on Twitter and Quora, and keep across my Hacker News comments page.
Photo: my partner Violet outside of Google IO with the Android plushie Google gave me during my registration
As you may have read, lately we have been working on accessibility on Linux. Since Gnome has a mature accessibility framework, it was logical to contact them for advice and sharing of infrastructure. I was invited to attend the “hackfest” dedicated to improving the current framework and making it possible to use the existing code with other toolkits (read Qt).
The ATK/AT-SPI Hackfest took place in A Coruña, Spain, and was hosted by Igalia. An enthusiastic group of 11 people sat down to discuss and fix issues. We had Gecko/Firefox, Webkit, GTK, Clutter and Qt/KDE present on the toolkit side, the ATK/AT-SPI people for the actual framework and the Orca screen reader team. It was great to see this group of people work towards a common goal.

While Andrey is busy implementing partitioned replication infrastructure code for the PHP replication and load balancing plugin (PECL/mysqlnd_ms), I continued my search for ideas to steal. Mr. Robert Hodges, I’ve robbed the idea of a service level and caching. If an application is able to function with stale data read from a MySQL replication slave, it can also deal with stale data from a local cache. The replication plugin (PECL/mysqlnd_ms) could, in certain cases, populate the query cache plugin (PECL/mysqlnd_qc) for you and read replies from it.
In the blog posting "Implementing Relaxed Consistency Database Clusters with Tungsten SQL Router" Robert explains from a theoretical standpoint why his product allows application developers to set quality of service. The service level defines if eventual consistency is allowed or not. If so, the system is not required to return current data to all clients. Stale data may be served.
If using MySQL replication for read scale out, dealing with stale data is a standard task. Slaves may lag behind the master and not have the latest updates. Applications must be able to function with stale data. Given that the service level allows stale data, one can replace one stale data source with another. One can replace a MySQL slave possibly lagging behind with a local (TTL) cache.
| Layer | Service level: consistent | Service level: eventual consistent |
|---|---|---|
| Application | Any PHP MySQL application | Any PHP MySQL application |
| Plugin | PECL/mysqlnd_ms | PECL/mysqlnd_ms |
| Local cache | | Cache (PECL/mysqlnd_qc), TTL = 2s |
| Transport | Network | Network |
| Data source | MySQL master | MySQL slave, lagging 4 seconds |
All that needs to be done is combining PECL/mysqlnd_ms, the replication and load balancing plugin, with PECL/mysqlnd_qc, the query cache plugin. Of course, this should be done on the C level, inside the extensions. Ideally, applications using the combination of the two plugins would not need to bother with populating the cache and deciding when to read from it.
mysqlnd_ms_set_service_level($mysqli, MYSQLND_EVENTUAL_CONSISTENT);
$mysqli->query("SELECT id FROM test");
$mysqli->query("SELECT id FROM test");
The replication plugin would just know from the service level that queries may be served from the cache. For example, it could automatically decide to cache the SELECT from the example above. Could… this is brainstorming. No promises on features and time lines. I’m fishing for feedback.
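To make the brainstorming a little more concrete, here is a minimal userland sketch of the idea: when the service level allows stale data, a read may be answered from a local TTL cache instead of going over the network. Everything in it is hypothetical (the `TtlCache` class, the `queryWithServiceLevel()` function and the injected clock are made up for illustration); the real thing would live in C inside the two extensions and none of these names exist in PECL/mysqlnd_ms or PECL/mysqlnd_qc.

```php
<?php
// Hypothetical sketch only: a tiny TTL cache keyed by the SQL string.
final class TtlCache
{
    private $entries = [];
    private $ttl;
    private $clock;

    public function __construct(int $ttlSeconds, callable $clock)
    {
        $this->ttl = $ttlSeconds;
        $this->clock = $clock; // injected so the example does not depend on wall-clock time
    }

    public function fetch(string $sql, callable $loader)
    {
        $now = ($this->clock)();
        if (isset($this->entries[$sql]) && $this->entries[$sql]['expires'] > $now) {
            return $this->entries[$sql]['rows']; // still fresh enough: no network round trip
        }
        $rows = $loader($sql);
        $this->entries[$sql] = ['rows' => $rows, 'expires' => $now + $this->ttl];
        return $rows;
    }
}

// Hypothetical dispatcher: only eventual consistency may be served from the cache.
function queryWithServiceLevel(string $sql, bool $eventualConsistency, TtlCache $cache, callable $runOnServer)
{
    if (!$eventualConsistency) {
        return $runOnServer($sql); // consistent reads always go to the server
    }
    return $cache->fetch($sql, $runOnServer);
}

$now = 100;
$serverCalls = 0;
$clock = function () use (&$now) { return $now; };
$server = function (string $sql) use (&$serverCalls) {
    $serverCalls++;
    return [['id' => 1]]; // stand-in for a real result set
};
$cache = new TtlCache(2, $clock);

queryWithServiceLevel('SELECT id FROM test', true, $cache, $server);  // cache miss, hits the server
queryWithServiceLevel('SELECT id FROM test', true, $cache, $server);  // served from the cache
$now = 103;                                                           // TTL of 2s has passed
queryWithServiceLevel('SELECT id FROM test', true, $cache, $server);  // expired, hits the server again
queryWithServiceLevel('SELECT id FROM test', false, $cache, $server); // consistent, bypasses the cache
echo $serverCalls, "\n"; // prints 3
```

With a TTL of 2 seconds the cached copy is never older than a slave lagging 4 seconds might be, which is the whole point of swapping one stale source for another.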
Dream on, read on at https://wiki.php.net/pecl/mysqlnd_ms#raw_bin_ideas_rfcs. Feel free to edit the wiki page…
I just finished processing the “May 16 2011” run for HTTP Archive and HTTP Archive Mobile. Here are some interesting observations.
HTTP Archive (desktop)
As of today there are six months of historical data in the HTTP Archive. As a reminder, the world’s top ~17K web pages are being crawled. Since the actual URLs in the list at any given time can change, I start by looking at the trends for the intersection of URLs. This means the list of URLs is exactly the same for each data point. The total transfer size continues to grow: up 8 kB (1.1%) since the last run on Apr 30 2011, and up 53 kB (8%) from the first run back on Nov 16 2010.
Looking at the Image Transfer Size chart we see images are the main cause for this growth – up 5 kB from Apr 30 2011 and 43 kB since Nov 16 2010. Conversely, the transfer size of Flash has steadily dropped from 81 kB on Nov 16 2010 down to 71 kB on this last run on May 16 2011 – a 12% decrease.
I used the new Compare Stats page to compare the interesting stats from Nov 15 2010 to May 16 2011. I again chose the “intersection” of URLs to get an apples-to-apples comparison. In addition to the transfer size increases mentioned previously, we can see the growth of various JavaScript libraries on these ~17K web pages over the last six months:
- jQuery is up from 39% to 44%
- Facebook widgets have grown from 8% to 13%
- Twitter widgets increased from 2% to 4%
The “Pages Using Google Libraries API” chart shows adoption of this CDN has grown from 10% to 13% since Nov 16 2010. This means even more cross-site caching benefits for users.
HTTP Archive Mobile
The HTTP Archive Mobile just launched last week. There are only two weeks of historical data gathered on just the top 100 web pages, so there aren’t any major shifts in the trending charts just yet. Nevertheless, comparing the mobile stats between the last two runs, May 12 and May 16, has some interesting revelations.
The “Pages with the Most JavaScript” chart shows that the Twitpic page’s JavaScript grew from 308 kB to 374 kB – a 66 kB (21%) increase. This highlights the value of HTTP Archive (both desktop and mobile) to individual websites as a way to track performance stats and have a permanent history.
There are some other interesting stats at the individual website level, but we’ll need a few months of mobile data before we can draw conclusions about any trends. In the meantime, check back here around the 15th and 30th of each month to see the latest runs and discover new observations about how the web is running.
This morning we crossed a milestone: delivering a sustained average of 2,000 page views per second. Peak traffic is even higher. Once upon a time we could watch the logs fly by in our terminal windows to get a sense, almost like the guy in the Matrix ("...all I see now is blonde, brunette, redhead..."), of what was happening on the CloudFlare network. Today we generate over a million log lines a minute, so viewing the traffic that way doesn't work anymore. Instead, in the office we have a display with a globe showing global traffic (human, crawler, and threat) coming in to the various sites on our network. And, yes, for those of you who noticed the resemblance, we intentionally aped the look and feel from a display in the lobby of another company just down the road in Mountain View that not too long ago was a startup themselves.
This is a guest post written and contributed by Jon Dahl, CEO of Zencoder, a Rackspace Cloud Tools partner.

Zencoder is happy to announce that we have added native support for Rackspace Cloud Files storage.
Zencoder is the performance leader in scalable, fast, high-quality cloud-based video transcoding. It was launched in 2010 to solve the most complex part of the video publishing process: transcoding. Every video published, anywhere, needs to be transcoded (or encoded). But for many years, transcoding technology was unreliable, slow, and difficult to work with. Zencoder was founded to solve this problem. Cloud-based video transcoding brings all the benefits of cloud-based architecture to video – on-demand scalability, API-driven architecture, and no upfront cap-ex or ongoing maintenance required to stay current with changes in technology.
We’re big believers in cloud computing, for everything from storage to services (like video encoding). Rackspace has a great cloud platform, and we’ve had a number of customers who have wanted to use Cloud Files in conjunction with Zencoder. But in the past, this hasn’t been easy – you could get files to/from Zencoder using FTP, SFTP, and HTTP, but integrating with Cloud Files was difficult.
Now we’ve made this simple. If your original files are stored on Cloud Files, just give us a URL that starts with “cf”, like this:
{ "input" : "cf://username:api_key@container/object" }
And if you want Zencoder to deliver your finished files to Cloud Files, do the same thing:
{ "output" : { "url" : "cf://username:api_key@container/object" } }
See our documentation for more info.
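If you are assembling these payloads from code, something like the following PHP sketch may help. It only builds the JSON shown above; the `cloudFilesUrl()` helper is hypothetical (it is not part of any Zencoder library), the container and object names are placeholders, percent-encoding the URL parts is an assumption about how special characters should be handled, and putting `input` and `output` in one document is likewise an assumption. Please check the documentation before relying on any of it.

```php
<?php
// Hypothetical helper: assembles a cf:// URL in the username:api_key@container/object shape.
function cloudFilesUrl(string $username, string $apiKey, string $container, string $object): string
{
    // Assumption: percent-encode each part so characters such as "@" or ":" cannot break the URL.
    $path = implode('/', array_map('rawurlencode', explode('/', $object)));

    return sprintf(
        'cf://%s:%s@%s/%s',
        rawurlencode($username),
        rawurlencode($apiKey),
        rawurlencode($container),
        $path
    );
}

// Placeholder credentials and names, for illustration only.
$job = [
    'input'  => cloudFilesUrl('username', 'api_key', 'source-container', 'videos/original.mov'),
    'output' => ['url' => cloudFilesUrl('username', 'api_key', 'encoded-container', 'videos/final.mp4')],
];

// JSON_UNESCAPED_SLASHES keeps "cf://" readable instead of "cf:\/\/".
echo json_encode($job, JSON_UNESCAPED_SLASHES | JSON_PRETTY_PRINT), "\n";
```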
______________________________________________
Cameron Nouri, from the Rackspace Business Development team, is your connection to the Rackspace Cloud Tools Partner Ecosystem. If you have developed solutions or services that make it easier for people to take advantage of the cloud, he would like to talk to you! You can contact Cameron any time to learn more about this unique program and the benefits for your business.
Over the past few years, a few concepts and programming patterns have muscled their way into the hearts and minds of PHP developers from other languages and programming communities. These concepts range from the MVC application architecture and various modeling techniques (think ActiveRecord and Data Mapper) to a pure shift in the way we think about application architectures, like aspect-oriented programming (AOP) and event-driven programming. Perhaps it’s because PHP has been adopted at an enterprise level, thus increasing the demand for what developers might call enterprise quality programming patterns, or perhaps it’s simply because PHP’s ever evolving object model makes new things possible. After all, who doesn’t like new shiny things? Whatever the reason, one of the newest concepts (at least over the past 3 years or so) that has emerged as one of our heated topics of debate is how to manage object dependencies. Interestingly, the argument over how to manage dependencies is generally named after the solution its proponents offer: dependency injection (the abstract principle is actually called inversion of control).
In any circle of developers that are of the object-oriented persuasion, you’ll never hear an argument that dependency injection itself is bad. In these circles, it is generally accepted that injecting dependencies is the best way to go. Injecting object dependencies in PHP looks like this:
// constructor injection
$dependency = new MyRequiredDependency;
$consumer = new ThingThatRequiresMyDependency($dependency);
That’s basically it. There are many variations of this: setter injection, interface injection, call time injection, in addition to the above mentioned constructor injection. These are all valid ways of injecting the dependencies into the consuming object. Ultimately, the goal here is to avoid this:
class ThingThatHasAnExternalDependency
{
    public function __construct() {
        $this->dependency = new ARequiredDependency;
        // or
        $this->secondDependency = ARequiredDependency::getInstance();
    }
}
The above code is an example of a violation of the Hollywood Principle, which basically states: “Don’t call us, we’ll call you.”
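For contrast, here is a rough sketch of what the injected version buys you. The names (`Mailer`, `RecordingMailer`, `SignupService`, `setGreeting()`) are invented for this illustration and come from no particular framework. The consumer declares what it needs in its constructor, an optional collaborator arrives through a setter, and the caller decides which implementation to hand over, which also makes it trivial to pass in a recording stand-in.

```php
<?php
// All names below are hypothetical and exist only for this example.
interface Mailer
{
    public function send(string $to, string $body): bool;
}

// A stand-in implementation that records messages instead of sending them.
final class RecordingMailer implements Mailer
{
    public $sent = [];

    public function send(string $to, string $body): bool
    {
        $this->sent[] = ['to' => $to, 'body' => $body];
        return true;
    }
}

final class SignupService
{
    private $mailer;
    private $greeting = 'Welcome!';

    // Constructor injection: the required dependency is handed in, never created here.
    public function __construct(Mailer $mailer)
    {
        $this->mailer = $mailer;
    }

    // Setter injection: an optional value with a sensible default.
    public function setGreeting(string $greeting): void
    {
        $this->greeting = $greeting;
    }

    public function register(string $email): bool
    {
        return $this->mailer->send($email, $this->greeting);
    }
}

$mailer = new RecordingMailer();
$service = new SignupService($mailer);
$service->setGreeting('Hello and welcome!');
$service->register('alice@example.com');

echo count($mailer->sent), ' message(s) recorded for ', $mailer->sent[0]['to'], "\n";
```

Nothing inside `SignupService` reaches out with `new` or `getInstance()`; whoever constructs it stays in control of the wiring.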
Yet, this is not the heart of the argument. Perhaps it was 4-5 years ago in the PHP community, but it’s not anymore. The heart of the argument is not should we be doing it, but how do we go about doing it.
This article is not about the intricacies and implementation details of DI containers and DI frameworks. It’s also not about the various ways and means of injecting dependencies into other objects, and which method might be better. In fact, this article has no opinion on whether injecting dependencies is even good for you or your application. This article is an exploration of how adopting any DI framework for PHP affects the lifecycle of a project, both the code and the developer, team or organization that is constructing it.
A Brief History of Dependency Management In PHP
It is important to know why PHP is as popular as it is; after all, it’s this popularity that DI frameworks fight against for adoption inside a PHP application framework. To understand PHP’s popularity, history, and evolution, let’s look at this code:
// these 6 lines actually represent 5 different web centric "languages"!
include_once 'includes/config.php'; // ultimately there is a mysql_connect() call in here somewhere
include_once 'templates/header.php';
$rows = mysql_query('SELECT * FROM users'); // magically uses the mysql_connect() resource
while ($row = mysql_fetch_assoc($rows)) { // mysql_query() returns a result resource, so fetch it row by row
    echo '<div class="user-row"><a href="/delete-user.php" onclick="someJSFunction();">' . $row['username'] . '</a></div>';
}
include_once 'templates/footer.php';
From the beginning, we’ve been trained into thinking that our dependencies are magically managed. As you can see above, the mysql_query() function, while it will accept a connection resource, does not require it. In fact, if it’s not supplied, it will use the last connection opened by mysql_connect() inside the PHP runtime. Assuming that the above mentioned delete-user.php script is part of a larger collection of PHP scripts, which we will call “the application” … it is important to note that even this script itself is pulling in its dependencies instead of having them injected. For all intents and purposes, config.php, header.php and footer.php are all dependencies of this script, much like they are for other scripts similar in nature to this delete-user.php. To sum it up, if a new dependency is required by the business logic portion of this application (i.e. the lines between the header and footer), it now has to be introduced to all scripts in this application. This does not exactly adhere to the DRY principle.
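The hazard of that implicit lookup is easy to reproduce without a database. The sketch below is a hypothetical stand-in (the `LinkRegistry` class is invented here and merely mimics the fallback the PHP manual describes for mysql_query(), namely the last link opened by mysql_connect()); it shows how an extra connection opened somewhere in an include can silently redirect a query that never named its link.

```php
<?php
// Hypothetical stand-in for the old mysql_* behaviour; no database is involved.
final class LinkRegistry
{
    private static $last = null;

    // Mimics mysql_connect(): remembers the most recently opened link.
    public static function connect(string $name): string
    {
        self::$last = $name;
        return $name;
    }

    // Mimics mysql_query(): the link is optional and falls back to the last one opened.
    public static function query(string $sql, ?string $link = null): string
    {
        $link = $link ?? self::$last;
        if ($link === null) {
            throw new RuntimeException('No open connection');
        }
        return $link . ': ' . $sql;
    }
}

$users = LinkRegistry::connect('users-db');
echo LinkRegistry::query('SELECT * FROM users'), "\n";         // users-db: SELECT * FROM users

$stats = LinkRegistry::connect('stats-db');                    // e.g. added later in some include
echo LinkRegistry::query('SELECT * FROM users'), "\n";         // stats-db: SELECT * FROM users
echo LinkRegistry::query('SELECT * FROM users', $users), "\n"; // users-db: SELECT * FROM users
```

Only the last call, which passes its dependency explicitly, keeps working the same way no matter what the included files do.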
But let’s take a step back and look at this snippet of code from the organizational perspective. To do this, we must first understand the various phases of the code’s lifecycle within any organization. For the purposes of this example, let’s assume that from idea to production, code will go through the following phases: development, build, deployment, and application start-up (in production). If this were a C/C++ or Java project, code would have been written (developed), compiled (built), then packaged or handed to some deployment tool’s process (deployed); it then would have been run (executed via some startup script, or by executing a binary). PHP, and Perl at the time, achieved all of the same objectives in fewer steps, making it a wildly popular platform for highly iterative web projects. This same application in PHP would have been coded in some text editor (developed) and FTP’d up to a production server (deployed). You’ll notice that it neither had to be built/compiled nor started on the server, since the target, Apache, was already running with PHP embedded into it. For all intents and purposes, a cheap and easy FTP tool was both the build and deployment tool for this application’s lifecycle.
It was this simplicity that made PHP the popular choice for web applications. This popularity was attained because the simplicity of the PHP platform allowed for two extremely important facets of development to emerge: the idea of building an application became approachable to even the novice individual, and without all the cruft that came along with the application lifecycle, building and deploying applications in PHP increased PHP’s “fun-ness” factor.
While this style of building applications allowed for a proliferation of PHP applications to be developed, there was in fact a downside that revealed itself later. As applications quickly grew, their ability to be maintained decreased. We give them the name “Spaghetti code”, and for all the right reasons. Objects, if they were even being used, were generally wrappers around procedural functionality, so object dependency management wasn’t even a consideration for most developers. Looking back, perhaps it was this original simplicity that allowed developers to create applications without even having to know what a dependency was or how to find it. In any case, as these applications grew uncontrollably, maintaining and hacking on them rapidly lost the PHP fun factor.
A Brief History of DI Frameworks
As PHP developers started identifying the problems with their Model 1 applications, they started looking for solutions in other programming communities. At this time, the Java community was still heavily rooted in the enterprise/software development/software engineering world, and problems such as dependency management already had some interesting solutions. Most notably, there was the Spring Framework, whose primary facility for dependency management was a component called the IoC Container, or the Inversion of Control container. This container managed the full lifecycle of object creation using callbacks. This meant that you no longer had to use the “new” keyword (the same new keyword as in PHP). Also, it wired the dependencies for you at instantiation time. This meant that you no longer had to concern yourself with how dependencies were injected, be it through the constructor, properties or setter methods. The Spring Framework was one of the first frameworks that encouraged the use of definition files to manage the knowledge required to wire all your dependencies together. True to form in the Java community, these definition files were created in XML.
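To make the three injection styles just mentioned (constructor, setter and property) concrete, here is a minimal sketch in PHP. It is not Spring’s API: Mailer and the three service classes are hypothetical names invented for illustration, and the typed properties assume PHP 7.4 or later.

```php
<?php
// Hypothetical classes, invented for this example.
final class Mailer
{
    public function send(string $to, string $body): string
    {
        return "to={$to}; body={$body}";
    }
}

// Constructor injection: the object cannot exist without its dependency.
final class SignupService
{
    private Mailer $mailer;

    public function __construct(Mailer $mailer)
    {
        $this->mailer = $mailer;
    }

    public function welcome(string $email): string
    {
        return $this->mailer->send($email, 'Welcome!');
    }
}

// Setter injection: the dependency is supplied after construction.
final class ReminderService
{
    private ?Mailer $mailer = null;

    public function setMailer(Mailer $mailer): void
    {
        $this->mailer = $mailer;
    }

    public function remind(string $email): string
    {
        if ($this->mailer === null) {
            throw new LogicException('setMailer() must be called first');
        }
        return $this->mailer->send($email, 'Reminder');
    }
}

// Property injection: the dependency is assigned to a public property.
final class ReceiptService
{
    public ?Mailer $mailer = null;

    public function receipt(string $email): string
    {
        if ($this->mailer === null) {
            throw new LogicException('$mailer must be assigned first');
        }
        return $this->mailer->send($email, 'Receipt');
    }
}

// Wiring by hand, which is the work a container takes over.
$mailer = new Mailer();

$signup = new SignupService($mailer);

$reminder = new ReminderService();
$reminder->setMailer($mailer);

$receipt = new ReceiptService();
$receipt->mailer = $mailer;

echo $signup->welcome('ann@example.com'), PHP_EOL;
```

A container that “wires the dependencies for you” is essentially performing the last block on your behalf, based on whatever definition it has been given.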
As it might seem, this is indeed a deviation from the PHP philosophy that had made PHP so popular. PHP allowed you to write the most minimal amount of code to complete your application. In the Java/DI world, particularly with the Spring framework, you had a much richer application lifecycle. Not only were you developing code for your application, but you were creating code about code to manage code. This is known as meta-programming. In addition to this meta-programming, you also had the compilation phase required by the Java platform, which was generally tucked away inside your build-time tasks. Moreover, this application had to be deployed (there were generally tools for this too), and (for good measure), due to the platform, your application had to be started. Needless to say, this application lifecycle might seem heavier, for lack of a better term, to the average PHP developer.
Since then, several frameworks have cropped up that sport some kind of dependency management. Before this technique was picked up in PHP, they were all heavily rooted in the Java and .NET communities. A quick google search will return a few notable names like PicoContainer, Spring.NET, Unity, Butterfly and google-guice to name a few. These frameworks attain popularity since they attempt to ease some of the burdens that DI places upon the developer whether it be by using reflection to create definitions, or even adding an annotation system so that DI definitions can be written inside the code they are set to manage.
DI and PHP
To understand the attainability of having a dependency management framework for PHP, one should first understand how the counterparts in Java and .NET rely upon their respective platforms to do certain jobs. For a quick reference, see the images from this blog post. One of the more important facets to remember is that the expected application lifecycle of a Java/.NET application is much richer. You are expected to have build-time tasks. You are expected to have deployment tasks. And, generally, your application understands the difference between being in development, staging and production, so it can adjust how it runs accordingly. Moreover, the platform itself has facilities in place that aid the developer both at development time, with code generation, and in production.
PHP never expects or facilitates the usage of any kind of build-time tasks. PHP also does not have any kind of built-in annotation support (a meta-programming technique), nor does it have any kind of application scope or per-application memory space. What does this mean for someone who is creating a DI container? Let’s explore.
Development Time
Generally speaking, any time you are writing, altering or just shifting code around, you are in development mode, and your application should be running in a development environment. The structure of your application’s classes, functions and files within the filesystem is probably changing each time you click save. Dependency management systems require knowledge of your code in order to effectively do their job. This knowledge generally comes in the form of some kind of definition.
This definition can be created by hand, by the developer, generated at runtime by some application hooks, or generated with the use of a special tool. If this is done by hand, a developer is required to explicitly map the various functions/methods that will need to be called in order to inject a particular object dependency. The more dependencies you have, the more verbose this definition might become.
A better route would be to generate this definition file. After all, the code you’ve written, if written correctly, will self-describe its dependencies. There are two options for generation: manual and automatic. An example of manual generation would be a developer giving a command line tool the minimal information it needs to be able to go parse your code, figure out the dependency map for itself, and generate some kind of definition to be used during runtime. Minimal information might include some kind of seed information, like where to find your classes or perhaps what filters to use when inspecting classes. Sometimes, these tools might make use of special interfaces (also called interface injection) to understand that their purpose is to describe the various dependencies of the class implementing said interface. Another approach might be to utilize special annotations on classes and class methods that describe the various required and optional dependencies and how they are to be injected.
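As a rough illustration of code “self-describing” its dependencies, the sketch below uses PHP’s Reflection API to read constructor type hints and build a definition array. The classes (Clock, Logger, ReportService), the generateDefinition() function and the shape of the resulting array are all assumptions made for this example, not the output format of any particular DI framework.

```php
<?php
// Hypothetical application classes whose constructors declare their dependencies.
final class Clock
{
}

final class Logger
{
    public function __construct(Clock $clock)
    {
    }
}

final class ReportService
{
    public function __construct(Logger $logger, Clock $clock, int $retries = 3)
    {
    }
}

/**
 * Build one definition entry per class by reading constructor type hints.
 * Returns [className => [parameterName => dependencyClassName, ...], ...].
 */
function generateDefinition(array $classNames): array
{
    $definition = [];
    foreach ($classNames as $className) {
        $constructor = (new ReflectionClass($className))->getConstructor();
        $parameters = $constructor ? $constructor->getParameters() : [];

        $dependencies = [];
        foreach ($parameters as $parameter) {
            $type = $parameter->getType();
            // Only class-typed parameters are treated as injectable;
            // scalars such as $retries are left to their defaults.
            if ($type instanceof ReflectionNamedType && !$type->isBuiltin()) {
                $dependencies[$parameter->getName()] = $type->getName();
            }
        }
        $definition[$className] = $dependencies;
    }
    return $definition;
}

$definition = generateDefinition([Clock::class, Logger::class, ReportService::class]);
var_export($definition);
```

A command line tool of the kind described above would do much the same thing, except that the list of class names would come from scanning the directories the developer pointed it at, and the result would be written to a file rather than printed.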
The same techniques employed in this manual approach could also be put to use in an automatic approach. In an automatic approach, imagine this same command line tool from the manual approach was now a service of the application itself. While in development mode, it would run as often as need be in order to determine if code changes have happened. If they have, the service would regenerate the dependency definition file so that the rest of the application can utilize the dependency definition inside the DI container available to the application during runtime.
There are a couple of concerns that are specific to PHP with regards to dependency management. Since PHP is a shared-nothing architecture with no application-level memory, this definition would need to be loaded, parsed and put into memory on each request. The larger the dependency tree that you track, the larger the memory footprint of the dependency definition graph. Furthermore, since this definition has to be loaded on each request, if it is in a non-native format (meaning anything other than PHP code), there are certain costs in converting this format, be it XML, YAML, JSON, or INI, to the in-memory structure that the dependency management container requires. What’s more, the PHP platform does not keep track of file changes. So without some kind of user-land tracking, it is hard to know which files have changed during development. Thus, your dependency management system, if it’s taking an automatic approach, would have to rescan the filesystem for changes upon each request during development, which has its own consequences.
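One possible form of that user-land tracking is to fingerprint the source tree on each development request and regenerate the definition only when the fingerprint changes. The sketch below is an assumption about how this might be done, not how any specific framework does it; sourceFingerprint() and the temporary directory and file names are made up for the demonstration.

```php
<?php
/**
 * Fingerprint the PHP files under a directory from their path, mtime and size.
 * A changed fingerprint means the definition should be regenerated.
 */
function sourceFingerprint(string $directory): string
{
    clearstatcache(); // avoid stale stat results within a single script run
    $entries = [];
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($directory, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if ($file->isFile() && $file->getExtension() === 'php') {
            $entries[] = $file->getPathname() . ':' . $file->getMTime() . ':' . $file->getSize();
        }
    }
    sort($entries); // directory iteration order is not guaranteed
    return md5(implode("\n", $entries));
}

// Demo in a throwaway directory (path and file names are made up).
$dir = sys_get_temp_dir() . '/di-fingerprint-' . uniqid();
mkdir($dir);
file_put_contents($dir . '/UserRepository.php', '<?php class UserRepository {}');
$before = sourceFingerprint($dir);

file_put_contents($dir . '/DbAdapter.php', '<?php class DbAdapter {}');
$after = sourceFingerprint($dir);

$changed = $before !== $after; // a new file appeared, so the fingerprints differ
echo $changed ? "regenerate definition\n" : "definition is current\n";

// Clean up the demo files.
unlink($dir . '/UserRepository.php');
unlink($dir . '/DbAdapter.php');
rmdir($dir);
```

Note the cost this illustrates: every development request walks and stats the whole tree, which is exactly the kind of consequence mentioned above.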
Deployment Time
When one is done writing code and is ready to push this application into production, the act of pushing this application is called deployment. The mode for this application is now considered “production”. In production, you can be sure that the structure of your code is stable and will not change, thus your dependency graph is now safe from changes too. Since this is the case, there is no longer a need to keep updating and regenerating this dependency definition file like you were during development.
Even though the definition is no longer changing, there still is the concern about how expensive it is to load this definition on each request. Naturally, the cheapest form of definition would be a PHP array or structure describing the definition that can then be loaded in-memory. Other file types like XML, YAML, JSON, etc. first have to go through a parsing phase before they can be used. Parsing these files could be expensive, and could benefit from some kind of caching. Caching the definition in some way, shape or form would ensure there is minimal overhead per request when the application is using this dependency management container.
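A minimal sketch of such caching, assuming the definition is authored in JSON: the first load parses the JSON and writes the result out as a plain PHP file using var_export(), and later loads simply require that file. The loadDefinition() function, the file names and the definition’s shape are all hypothetical; JSON_THROW_ON_ERROR assumes PHP 7.3 or later.

```php
<?php
/**
 * Return the definition as a PHP array, parsing the JSON source only when
 * the compiled PHP cache is missing or older than the source.
 */
function loadDefinition(string $jsonFile, string $cacheFile): array
{
    clearstatcache();
    if (is_file($cacheFile) && filemtime($cacheFile) >= filemtime($jsonFile)) {
        return require $cacheFile; // plain PHP, no foreign format to parse
    }

    $definition = json_decode(file_get_contents($jsonFile), true, 512, JSON_THROW_ON_ERROR);
    $php = '<?php return ' . var_export($definition, true) . ';' . PHP_EOL;
    file_put_contents($cacheFile, $php, LOCK_EX);

    return $definition;
}

// Demo in a throwaway directory (paths and contents are made up).
$dir = sys_get_temp_dir() . '/di-cache-' . uniqid();
mkdir($dir);
$jsonFile = $dir . '/definition.json';
$cacheFile = $dir . '/definition.cache.php';
file_put_contents($jsonFile, json_encode([
    'UserRepository' => ['class' => 'UserRepository', 'constructor' => ['DbAdapter']],
]));

$first = loadDefinition($jsonFile, $cacheFile);  // parses JSON, writes the cache
$second = loadDefinition($jsonFile, $cacheFile); // served from the PHP cache file

// Clean up the demo files.
unlink($jsonFile);
unlink($cacheFile);
rmdir($dir);
```

In production the timestamp comparison could be dropped entirely, since the definition is generated once at deployment time; with an opcode cache in place, the required PHP file would then not even need to be recompiled on each request.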
Other Observations & Criticisms
It is important to realize that dependency management solutions in and of themselves are, in every sense of the word, full frameworks. They require that you understand their philosophy as well as have a minimal understanding of what facilities they are offering in order to use them effectively. To understand the true benefits of any framework, one must first know the pain points the framework is attempting to solve. Seeing the end result of a framework without knowing what it is facilitating might lead one to dismiss it as overkill or unintuitive. For example, take the following code (typical of dependency management systems):
$userRepository = $dic->get('UserRepository');
If you encounter this line of code without fully understanding the dependency injection container being used, you wouldn’t be able to appreciate its usefulness. You could instantiate your Application\Model\UserRepository yourself, sure, but you’d also have to locate and inject the database adapter to use, and into that you’d have to inject and load the configuration for that database connection. If you are doing this in multiple controller actions, there is a lot of repeated boilerplate code that is required to “wire” the UserRepository object. Internally, the DiC object is loading and consulting a definition, creating objects, injecting those objects, and returning the requested object, fully wired and ready to use.
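To show roughly what “consulting a definition, creating objects, injecting those objects” could mean, here is a deliberately tiny container. It is a sketch under assumptions, not the internals of any real framework: the Dic class, the definition format (‘class’ plus a list of ‘service’ or ‘value’ arguments) and the DbConfig/DbAdapter classes are all invented for this example, and it has no protection against circular dependencies.

```php
<?php
// Hypothetical application classes.
final class DbConfig
{
    public string $dsn;

    public function __construct(string $dsn)
    {
        $this->dsn = $dsn;
    }
}

final class DbAdapter
{
    public DbConfig $config;

    public function __construct(DbConfig $config)
    {
        $this->config = $config;
    }
}

final class UserRepository
{
    public DbAdapter $db;

    public function __construct(DbAdapter $db)
    {
        $this->db = $db;
    }
}

// The definition: which class to build and what to pass to its constructor.
$definition = [
    'DbConfig' => [
        'class' => DbConfig::class,
        'arguments' => [['value' => 'mysql:host=localhost;dbname=app']],
    ],
    'DbAdapter' => [
        'class' => DbAdapter::class,
        'arguments' => [['service' => 'DbConfig']],
    ],
    'UserRepository' => [
        'class' => UserRepository::class,
        'arguments' => [['service' => 'DbAdapter']],
    ],
];

final class Dic
{
    private array $definition;
    private array $instances = [];

    public function __construct(array $definition)
    {
        $this->definition = $definition;
    }

    public function get(string $name): object
    {
        if (isset($this->instances[$name])) {
            return $this->instances[$name]; // each service is built once and shared
        }
        if (!isset($this->definition[$name])) {
            throw new InvalidArgumentException("No definition for '{$name}'");
        }

        $entry = $this->definition[$name];
        $arguments = [];
        foreach ($entry['arguments'] as $argument) {
            // A 'service' argument is resolved recursively; a 'value' is passed through.
            $arguments[] = isset($argument['service'])
                ? $this->get($argument['service'])
                : $argument['value'];
        }

        $class = $entry['class'];
        return $this->instances[$name] = new $class(...$arguments);
    }
}

$dic = new Dic($definition);
$userRepository = $dic->get('UserRepository');
echo $userRepository->db->config->dsn, PHP_EOL;
```

Without the container, each controller action would repeat the three `new` calls in the right order; with it, the knowledge of that order lives in the definition alone.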
The above code also demonstrates a common criticism of dependency management frameworks, which is also a criticism of frameworks in general. By using this framework, you are moving further away from the facilities of the language or platform itself. Instead of using the “new” keyword to create a new object, you’ve asked another object to create this requested object for you. What this does is shift developers away from utilizing the language’s well-understood API and onto the framework’s API. Additionally, this kind of code is not easily understood by IDEs. While special features could be added to the IDE to support this framework, it does not inherently know what kind of object is being returned by the $dic->get(..) method call.
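Two common workarounds for the IDE problem can be sketched as follows. Both are conventions rather than features of any particular container: the inline @var docblock is a hint that many IDEs and static analysers recognise, and the Services class is a hypothetical typed wrapper invented for this example. The Dic class here is only a stand-in so that the snippet runs on its own.

```php
<?php
final class UserRepository
{
}

// Stand-in container: get() can only promise "some object" to the IDE.
final class Dic
{
    public function get(string $name): object
    {
        return new $name();
    }
}

$dic = new Dic();

// Option 1: an inline docblock tells the IDE what get() returned.
/** @var UserRepository $userRepository */
$userRepository = $dic->get('UserRepository');

// Option 2: a typed accessor, so the return type is part of the signature
// and is also checked at runtime.
final class Services
{
    private Dic $dic;

    public function __construct(Dic $dic)
    {
        $this->dic = $dic;
    }

    public function userRepository(): UserRepository
    {
        $service = $this->dic->get('UserRepository');
        if (!$service instanceof UserRepository) {
            throw new UnexpectedValueException('Container returned an unexpected type');
        }
        return $service;
    }
}

$typed = (new Services($dic))->userRepository();
```

The trade-off is that the first option is only a comment that can drift out of date, while the second adds one more layer of framework-specific code between the developer and the “new” keyword.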
Summary
While dependency management frameworks have clear drop-in benefits, there exist a few considerations that have unknown or unexplored consequences. For example, if the benefit is such that all dependencies are managed, and all a developer has to do is configure it, does that encourage deeper object graphs when creating classes and class dependencies? If so, what is the performance impact of these deep object graphs, particularly on the PHP platform? What are the memory implications of such object graphs, and what are the speed implications of them? Furthermore, if one needed to debug an object that has been generated by a dependency management framework, is that easily possible?
At the end of the day, whether or not to use a dependency management framework is a matter of cost versus benefit. In order to be able to make an informed decision, a developer should consider a few scenarios. First, one should know what code might look like with and without this new framework. This will give an indication of the cost/benefit at the code level: does it actually save lines of code and developer headaches? Secondly, one should consider how much added knowledge a developer or a team of developers needs in order to understand this framework. Lastly, one should consider what kind of performance impact implementing this new framework has on the application’s throughput.
There are multiple metrics that are really useful for read workload analysis, that should all be tracked and looked at in performance-critical environments.
The most commonly used is of course Questions (or ‘Queries’, ‘Com_select’) – this is probably the primary finger-pointing metric that can be used in communication with different departments (“why did your qps go up by 30%?”). It doesn’t always reveal actual cost: it can be an increase in actual request rates, a new feature, a fat-fingered error somewhere in the code, or an improperly handled cache failure.
Another important one to note is Connections – MySQL’s costly bottleneck. Though most users won’t be approaching the ~10k/s area (at that point connection pooling actually starts making sense), it is worth checking for other reasons, such as “maybe we connect when we shouldn’t”, needless reconnects, or a need to look more at thread cache performance or pooling options. There are some neighboring metrics like ‘Bytes_sent’ – make sure you don’t hit 120MB/s on a gigabit network :-)
Other metrics are usually way more about what actually gets done. The major query efficiency signal for me for a long time used to be Innodb_rows_read. It immediately points out queries which don’t use indexes properly or read too much data. It gets a bit confusing if a logical backup is running, but backup windows aside, this metric is probably one that is easy enough to track and understand. It has been extremely helpful for detecting query plans gone wrong too – quite a few interesting edge cases could be resolved with FORCE INDEX (that’s a topic for another post already :-)
For I/O heavy environments there are a few metrics that show mostly the same thing – Innodb_buffer_pool_reads, Innodb_data_reads, Innodb_pages_read. They all show how much your requests hit underlying storage, and higher increases ask for better data locality, more in-memory efficiency (smaller object sizes!) or simply more RAM/IO capacity.
For a long time lots of my metrics-oriented performance optimization could be summed up in this very simple ruleset:
- Number of rows shown to user in the UI has to be as close as possible to rows read from the index/table
- Number of physical I/Os done to serve rows has to be as close to 0 as possible :-)
Something I like to look at is the I/O queue size (both via iostat and from InnoDB’s point of view) – Innodb_data_pending_reads can tell how loaded your underlying storage is – on rotating media you can allow multiples of your disk count, on flash it can already mean something is odd. Do note, innodb_thread_concurrency can be a limiting factor here.
Overloads can also be detected from Threads_running, which is easy enough to track and is extremely important quality-of-service data.
An interesting metric that lately became more and more important for me is Innodb_buffer_pool_read_requests. Though it is often used to compute buffer pool efficiency as a ratio with ‘buffer pool reads’, it is actually much more interesting if compared against ‘Innodb_rows_read’. While Innodb_rows_read and Handler* metrics essentially show what has been delivered by InnoDB to the upper SQL layer, there are certain expensive operations that are not accounted for, like index estimations.
Though tracking this activity helps I/O quite a bit (right FORCE INDEX reduces the amount of data that has to be cached in memory), there can be also various edge cases that will heavily hit CPU itself. A rough example could be:
SELECT * FROM table WHERE parent_id=X and type IN (1,2,4,6,8,…,20) LIMIT 10;
If there was an index on (parent_id,type) this query would look efficient, but would actually do range estimations for each type in the query, even if they would not be fetched anymore. It gets worse if there’s a separate (type) index – each time the query is executed, records-in-range estimation is done for each type in the IN() list – and usually discarded, as going after id/type lookup is much more efficient.
By looking at Innodb_buffer_pool_read_requests we could identify optimizer inefficiency cases like this – and FORCE INDEX made certain queries 30x faster, even if we forced exactly the same indexes. Unfortunately, there is no per-session or per-query metric that would do the same – it could be extremely useful in sample-based profiling analysis.
Innodb_buffer_pool_read_requests:Innodb_rows_read ratio can vary due to multiple reasons – adaptive hash efficiency, deeper B-Trees because of wide keys (each tree node access will count in), etc – so there’s no constant baseline everyone should adjust to.
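Since these counters are cumulative, the ratios only mean something when computed from the difference between two samples. Below is a minimal sketch in PHP of that arithmetic, assuming two snapshots of SHOW GLOBAL STATUS taken some seconds apart; the readWorkloadDelta() function is hypothetical and the numbers in the two arrays are invented purely for illustration.

```php
<?php
/**
 * Turn two SHOW GLOBAL STATUS snapshots into rates and ratios.
 * $before and $after map status variable names to their counter values.
 */
function readWorkloadDelta(array $before, array $after, float $seconds): array
{
    $delta = [];
    $names = ['Questions', 'Innodb_rows_read', 'Innodb_buffer_pool_read_requests'];
    foreach ($names as $name) {
        $delta[$name] = (int) $after[$name] - (int) $before[$name];
    }

    $questions = $delta['Questions'];
    $rowsRead = $delta['Innodb_rows_read'];

    return [
        'qps' => $questions / $seconds,
        // How many rows InnoDB handed to the SQL layer per query.
        'rows_read_per_query' => $questions > 0 ? $rowsRead / $questions : null,
        // Buffer pool page accesses per row read; estimations inflate this.
        'read_requests_per_row' => $rowsRead > 0
            ? $delta['Innodb_buffer_pool_read_requests'] / $rowsRead
            : null,
    ];
}

// Invented sample values, 60 seconds apart.
$before = [
    'Questions' => 1000,
    'Innodb_rows_read' => 50000,
    'Innodb_buffer_pool_read_requests' => 200000,
];
$after = [
    'Questions' => 1600,
    'Innodb_rows_read' => 110000,
    'Innodb_buffer_pool_read_requests' => 500000,
];

$result = readWorkloadDelta($before, $after, 60.0);
print_r($result);
```

As noted above, there is no universal baseline for the last ratio, so it is most useful when tracked over time on the same workload and watched for sudden changes.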
I deliberately left out query cache (here’s the reason), or adaptive hash (I don’t fully understand performance implications there :). In mysql@facebook builds we have some additional extremely useful instrumentation – wall clock seconds per various server operation types – execution, I/O, parsing, optimization, etc.
Of course, some people may point out that I’m writing here from a stone age, and that nowadays performance schema should be used. Maybe there will be more accurate ways to dissect workload costs, but nowadays one can spend few minutes looking at metrics mentioned above and have a decent understanding what the system is or should be doing.
Authored by Robert McAden, Email & Apps Senior Product Manager
One of the things that I’ve enjoyed most about my job is talking to customers and hearing how our hosted Rackspace Email and hosted Microsoft Exchange product offerings have relieved them of the headache of hosting their own email servers and having to worry about things like spam, blacklisting, etc. That’s why I’m excited to be part of the team that is bringing a very special offer to our hosted dedicated and cloud server customers from SendGrid, a Rackspace Cloud tools partner, to relieve the headache of transactional email delivery.
Transactional email includes password reminders, sign-up verification, and online purchase receipts – email that is critical for businesses and their customers. However, on average 20% of legitimate emails never reach the intended inbox. SendGrid takes care of the infrastructure and deliverability while providing customers real-time analytics such as delivery, open, and click-through rates. Integration with your applications is easy through the SMTP API, the Web API, or even simple SMTP relay. In other words, fewer email headaches for customers.
Here is what two SendGrid customers have to say:
“We’re extremely happy with SendGrid. It was simple to configure and has given us great insights into how users view and use Gowalla emails. We don’t have to worry about scaling during periods of increased email volume, and we can focus on making Gowalla awesome without pouring time and effort into email configuration. SendGrid just works, and it works extremely well.”
- Adam McManus, Gowalla Senior Operations Engineer
“SlideShare used to handle email inhouse, and we did it right (with feedback loops and bounce processing and all the rest of it). It’s *really* complicated: you don’t want to do it yourself if you don’t have to. After switching to SendGrid, our email open rates are way up, and our operations team has more time to focus on other issues.”
- Jonathan Boutelle, SlideShare CTO
Much like our hosted Rackspace Email and hosted Microsoft Exchange platforms replace in-house mail servers for routine business email needs, SendGrid replaces outbound email infrastructure for transactional and even marketing email needs. SendGrid is also taking advantage of Rackspace’s infrastructure to provide an incredibly scalable service capable of handling their growing customer base. To date, SendGrid has delivered over 10 billion emails on behalf of more than 24,000 customers.
So yeah, I’m very excited to be part of the announcement that Rackspace has teamed up with SendGrid to provide SendGrid’s Bronze Plan to Rackspace hosted server customers free of charge! This Bronze Plan includes:
• Delivery of 40,000 emails each month;
• Access to the SendGrid SMTP/Web API and analytics tools; and
• SendGrid support
Everyone knows a business needs profits, customers, and ethics. What not everyone knows is which of those should come first, second, and third. A lot of companies fail because they get the sequence wrong.
The most common mistake is to put profits first. That opens the door for bad things to happen. Numbers become all-important, and almost any behavior is justified in the name of profit. Cheating sets in.
Instead, a company's priority should be to protect and enhance its reputation through ethical behavior. Within the confines of that behavior, its next most important goal should be to attract and keep customers. Third is figuring out how to make money.
Ethics, customers, profit. Don't forget that.
That sequence tops my list of four business verities as I get ready to retire at the age of 81 from Ball Corporation in a few weeks. None of the verities is earth-shaking, but they're distilled from practices that have helped my company survive for 130 years while competitors bit the dust.
Verity number two is that when it comes to selling customers, all of us can create a credibility in here that no one of us can establish out there. Most companies go about attracting and keeping customers by beefing up the sales force and encouraging salespeople to make the company sound better than it really is. But that is BS. The way to ethically attract and keep customers is for the company to be incredibly good at what it does so that no one needs to put lipstick on the pig.
At Ball, our sales policy has been that we are a manufacturing-driven company — right into the arms of our customers. Our plants are in total tune with our customers. In our charter plants, plant managers have taken over all maintenance-selling duties, leaving salespeople to create new business full time. This manufacturing-driven sales policy has been unique in our industry and was one of the reasons why American, Continental, and National Can eventually had to go out of business.
Verity number three is that employees do best when they are led, not managed. When employees are asked for their advice, rather than being told what to do, they bring their best efforts, talents, and abilities to the table. True, employees sometimes need to have direction, but they don't respond well to being over-managed. It has long been a tenet of Ball that the combined knowledge and experience of the team is superior to the singular knowledge and experience of the manager.
Verity number four is that to be fully engaged, people need to know where the company is going. Everyone needs to be aligned around a goal that makes sense. In Ball's case, the ultimate goal set by the founding five brothers is to change people's lives for the better. So what makes our world go round? The process starts with management taking such good care of us, the employees, that we feel so good about the company and ourselves that with the help of our suppliers, we take such good care of our customers that they want to buy all we can make, giving us full-capacity utilization, which gives our investors such an outstanding return on their investment that our shareholder value keeps going up. In this process, all our stakeholders get rich, allowing all of us to enrich the communities where we live and work, thereby helping change everyone's life for the better.
Looking back over the years, it has been important to be able to employ the same value system in my business life that I chose to employ in my social life. Out of this has grown a love for all that this company represents and a pride in being able to help change so many peoples' lives for the better.
Clif Reichard (creichar@ball.com) is a sales consultant who has sold rigid packaging substrates for more than five decades.
It’s been well over a year since I blogged here about Qt3D and its mission to bring Qt style to 3D programming with OpenGL. Qt3D started out as a Qt research project to bring Qt convenience and portability to OpenGL code, back in 2008/2009 – and it more than delivered on that promise, with several of its classes finding their way into Qt’s OpenGL API.
But Qt is all about Qt Quick now and, as those of you who have seen our QML 3D demo will know, Qt Quick has changed the Qt3D project from a set of handy C++ APIs to an amazing 3D scripting environment. I spoke about Qt3D and also about our QML bindings at Dev Days last year and the response was huge. We also got the message from many of you that it was less exciting to have Qt style portability for OpenGL, now that OpenGL ES 2 was becoming a de-facto standard. This made the transition to Qt Quick for our front end even more important.
So we turned the corner and now I’d like to share with you our latest work: Qt Quick 3D! We have spent the last few months debugging, improving and packaging so that we could share it with you. With Qt Quick 3D you have Qt3D under the hood, with the power of C++ implemented 3D scene graphs and 3D asset loading (.3ds and other popular formats), but the developer API is all QML.
What can you do with Qt Quick 3D right now?
- Create a QML application that features 3D content
- Load models made in 3D Studio Max or Blender into your app
- Add stock shapes such as cylinders and cubes
- Insert inline shader code into the QML to create neat effects
- Animate your scene with geometric animations such as rotate, translate and scale
- Control your scenes with QML states, transitions and animations
- Write app logic in JavaScript, and use Qt Quick 2D alongside your 3D content
Of course we could not put everything we wanted in, so we have plenty to work on – maybe you have some special requests? What is coming in the future:
- Support for next version of QML (this will likely be the first cab off the rank).
- Network awareness (at present URLs must be local file system only)
- Qt Creator integration for the model loader (to make it easier to position your 3D asset into your Qt Quick 3D scene)
- More sophisticated animations – skinning and morphing
- Physics engine integration
The packaging and Qt Quick integration is still in development, but we are coming to you with this now in the hope that we get some useful feedback on the packages themselves, and also on the functionality of Qt Quick 3D, so that we can improve it. Working with packages has been a new challenge for the Qt3D team, since we have had to turn our hand from coding C++ and QML bindings to battling with the Debian packaging system, the Symbian .sis format and the intricacies of Windows and other desktop installers. The bugs we know about are listed on our public bug-tracker, which is also the right place to report any new issues or to vote up the ones you’d like fixed first.
Without further ado, here are the downloads:
For the source package, step-by-step building instructions for Qt Creator (including screenshots) are available on our Qt Quick 3D documentation page.
If you want to start working with Qt Quick 3D, give us feedback or get assistance, join the community on our qt-3d@nokia.com mailing list.
Last week, in my blog on the Maturity Level List and in the previous week’s Maturity Levels, I left some indications of what would be expected of a maintainer of a portion of the Qt codebase. In this blog I’d like to explain a bit more about what’s expected of people working via the Qt Open Governance, what roles will exist and what responsibilities each will have.
In short, there will be three levels only:
- Contributor
- Approver
- Maintainer
The presence of a “Chief Maintainer” is not exactly a fourth level. More on that below.
Outside inspiration
For the Qt Open Source project, we set out with an idea that we wanted a very flat, very simple structure. The discussion on the mailing list was quite interesting, as we had in the beginning a top structure called a “board” or “oversight committee” (some early material I prepared included the recommendation: find a cheesy, nonsensical name out of a spy movie or series, like “Directorate”, “Oversight”, “Headquarters”, etc.). But in the end we decided to replace this and focus on existing examples of successful, large Open Source projects.
The two we mostly looked at were WebKit and the Linux kernel. There are benefits and drawbacks to both and we don’t think either would apply as-is. For Qt, we wanted to combine the best practices of each. We wanted to use the trust networks and use distributed development, but keep the number of hops between a new contributor and the code being in the final repository quite low. We also have a very aggressive regression testing and continuous integration system, which puts strong emphasis on quality.
The model I’m presenting here was discussed for a long time, but eventually it took one long IRC session between myself and Robin Burchell to put the ideas on paper. We posted all of our conclusions to the opengov mailing list then adjusted a little as the discussion required.
The Qt model
So we settled on basically two roles in addition to the contributor, as you can see in the drawing.
The Contributor is anyone who has the will and skill to contribute to the project. I should also point out that contributions come in many forms, not just code, but that’s what I’m focusing on here. The Contributor starts the process by doing the contribution over a code review tool we’ll make available and use ourselves too for our own work (more on the tool on another blog). Once this contribution is in, anyone can make comments, offer opinions, suggest improvements, etc. It is not necessary to have achieved any higher level to participate.
The next level is the Approver, which matches WebKit’s reviewer level. The Approver is a Contributor who has been given the right to approve a contribution for integration into the source code. Along with that right, he/she has the duty to ensure that the contribution follows certain guidelines, both quantitative, objective (the “Technical Fit”) as well as qualitative and subjective (the “Spirit Fit”). The Approver also has the duty to be constructive when offering suggestions and even when rejecting. To become an Approver, a Contributor must basically earn the right, by making contributions and proving that he/she has the best interest of the project at heart. An existing Approver should nominate, another Approver second it and, provided there are no objections, the rights are given.
Rights easily given are also easily taken away. Approvers are expected to act in the best interests of the project, regardless of their work affiliation. They are given the approval rights over the entire codebase, but they are expected to exercise them with care, concentrating on the parts they know well. The community punishes abuse by revoking those rights, although in my experience of Open Source projects this happens very, very seldom.
In most circumstances, this is all there is to it. A contribution is made and reviewed by one or more people; an Approver, who may or may not be one of the reviewers, approves it; and then the continuous integration system takes over, testing the contribution and integrating it into the codebase. This is where we are supposed to be very close to the WebKit model.
So where does the Maintainer come in? Well, the Maintainer is a respected member of the community who knows the code in question quite well. Unlike the Approvers, the Maintainers are tied to the code. A Maintainer basically has one duty and one right with respect to the particular codebase he/she is maintaining:
- The duty: ensure that the codebase is always ready for beta. That is, ensure that all contributions coming in are in the proper state to start the release process at any time (tested, stable, feature-complete, documented, API reviewed, etc.), and that the code stays in that state all the time.
- The right: set future direction, reject contributions not following that direction (including removing code already approved because it isn’t stable or finished), etc.
Along with those, there are other duties, like knowing what projects are happening in that module, offering advice to people who want to contribute new features to that module and ensuring that all contributions are reviewed (so they have to review if no one else does). Moreover, Maintainers are the people whom QA and the Release Team will contact in case something is wrong.
As the codebase grows, the need for new Maintainers will appear, so we always keep a healthy number of them. That is, we should have as many Maintainers as are really required to execute the job comfortably — but no more. It is my expectation that whenever that happens, there will be a natural candidate for Maintainer: someone who is already mostly doing that job of counselling and oversight of the codebase. Another case for a person becoming a Maintainer is when the previous one is no longer doing the job properly, either because of lack of time and/or interest, or because the actions being taken are not in the best interest of the project.
Maintainers can, of course, delegate. As an example, imagine QtCore is a module and it has a Maintainer. This person may delegate all of the Tools and Containers subsystem to a person he/she trusts, while he/she focuses on another part of the module. It’s important to note that this is a trust-based relationship: the actions of the sub-maintainer reflect on the Maintainer. Big modules like Qt 5’s “qtbase” will have several maintainers dividing the work.
The Chief Maintainer is not to be understood as a fourth level in the structure. The Chief Maintainer is a Maintainer like the others (we’re expecting this to be the Maintainer of the “qtbase” module, in fact). Instead, this person should be regarded as primus inter pares: a person whose opinion matters a great deal to the project, due to past and present contributions. Therefore, this person is accorded the right to break bigger stalemates and make overarching decisions. This person is recognised as such, which is why we’re singling him/her out here.
You may be asking who will be allowed to fill in those roles. They are all open to anyone who shows merit, deserving to be in that capacity, regardless of work affiliation. However, when we go live, we will “bootstrap” the model with the people who are already working on Qt: that is, the Trolls and existing contributors. The names of the people should come naturally, but to be on the safe side we should allow for an adjustment period. Also, it’s clear that the “bootstrap” will include mostly current Nokia employees. As time progresses, we expect non-Nokians to join in and participate, earning responsibilities.
A final note on other roles: the above description is really focused on the code workflow, which is what happens for virtually every Open Source project out there. But in addition to the above roles, there will be many other roles to make the project tick, like the QA team, the documentation team, the Release Management team, Community Management, communications and web presence, roadmapping and planning, etc. Each of those teams will probably work with just two levels: contributor and team lead (similar to the Maintainer), unless they choose otherwise. The people from Nokia currently executing those roles will be present at the Qt Contributor Summit, so we invite those interested in collaborating to talk to them and organise sessions during the summit to suggest ways of working together.
SQL injection is wonderful! MySQL Proxy can do it, mysqlnd plugins - even written in PHP (not Lua or C) - can do it. Global Transaction IDs are wonderful. A mashup of the PHP replication plugin and global transaction ID injection makes your replication cluster fail-over much smoother and opens up an opportunity for an API to support consistent reads from slaves "immediately" after a write. Less hassle identifying and promoting a new master for fail-over, even better read load balancing - my last proposal for the future of the PHP replication plugin.
What?
Think of a global transaction ID as a unique identifier for a change set in a database cluster. Replicas in the cluster use the global transaction ID to track changes. Because global transaction IDs are unique cluster-wide, you can easily compare the replication progress among the replicas, in particular, if the global transaction ID contains a sequence number.
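To make the comparison concrete, here is a minimal Python sketch. It assumes every replica can report the sequence number of the last global transaction ID it has applied. The replica names and numbers are invented for illustration, and nothing here is an existing MySQL or plugin API.

```python
# Hypothetical snapshot: last applied global transaction ID per replica.
# Names and values are made up for illustration.
last_applied = {"slave_b": 18271, "slave_c": 18269, "slave_d": 18270}


def most_advanced(progress):
    """Return the replica with the highest applied sequence number."""
    return max(progress, key=progress.get)


def lag_behind(progress, reference):
    """Return how many change sets each replica is behind `reference`."""
    return {name: progress[reference] - seq for name, seq in progress.items()}


candidate = most_advanced(last_applied)
print(candidate)
print(lag_behind(last_applied, candidate))
```

With purely local (file, position) tuples, no such one-line comparison is possible, because the numbers from different machines do not share a scale.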
MySQL replication does not use cluster-wide global transaction IDs. Change sets are identified by a log file name and a log file position. Name and position are local to a machine. The tuple (file=001, position=1272) on slave A does not necessarily address the same change set from master B as the tuple (file=001, position=1272) on slave C also replicating from master B.
Use case: server fail-over
Let’s say you have a MySQL replication cluster like A -> B -> C. A is the master. B is a first level slave, C is a second level slave. You write some data on the master. This creates a logical change set. The master records the change set in (file=001, position=100). B replicates it and puts the change set in (file=004, position=272). C reads the change set from B and remembers the position (file=004, position=272) to continue reading. B fails and you want C to continue reading changes from A. How do you find that C needs to continue reading from A at (file=001, position=100)? It is not always an easy task…
One change set (INSERT), three identifiers (file, position)

| Server | File | Position | Change set |
|---|---|---|---|
| A, master | file=001 | position=100 | INSERT … |
| B, first level slave | file=004 | position=272 | INSERT … |
| C, second level slave | file=002 | position=161 | INSERT … |
Things would be much easier if there were a global transaction ID. The master (server A) would record a global_trx_id=18271 together with the change set. B would leave it untouched and add the global_trx_id=18271 to its change log. C would read the change set 18271 and remember the ID instead of a machine-local position. If B fails, C could continue reading changes from A immediately. C would know that it has to continue at 18271. This is the idea behind WorkLog #3584.
One change set (INSERT), one global transaction ID

| Server | File | Position | Global transaction ID | Change set |
|---|---|---|---|---|
| A, master | file=001 | position=100 | global_trx_id=18271 | INSERT … |
| B, first level slave | file=004 | position=272 | global_trx_id=18271 | INSERT … |
| C, second level slave | file=002 | position=161 | global_trx_id=18271 | INSERT … |
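The fail-over lookup can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the log of server A is a plain list, the IDs increase monotonically within it, and the second and third entries are invented to give the lookup something to find. `LogEntry` and `resume_point` are hypothetical names, not part of MySQL.

```python
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    file: str
    position: int
    global_trx_id: int
    statement: str


# Simplified, partly invented model of server A's change log.
# Assumption: global_trx_id grows monotonically along the log.
log_a = [
    LogEntry("001", 100, 18271, "INSERT ..."),
    LogEntry("001", 180, 18272, "UPDATE ..."),
    LogEntry("001", 240, 18273, "DELETE ..."),
]


def resume_point(master_log, last_applied_id) -> Optional[LogEntry]:
    """Return the first entry on the new master the slave has not applied yet."""
    for entry in master_log:
        if entry.global_trx_id > last_applied_id:
            return entry
    return None  # the slave is already up to date


# C last applied 18271 (read via B) and now switches to A.
print(resume_point(log_a, 18271))
```

C never needs to know B's (file=004, position=272). The global ID alone is enough to find the matching local coordinates on A.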
Jan shows in MySQL Proxy and a Global Transaction ID a wonderful, client-side, SQL injection based hack to maintain a global transaction ID. The PHP mysqlnd replication and load balancing plugin could learn to do the very same for you.
Create a MEMORY table which is replicated with a single UNSIGNED BIGINT and increment it at the end of each transaction.
CREATE TABLE trx (
trx_id BIGINT UNSIGNED NOT NULL
) ENGINE=memory;
INSERT INTO trx VALUES ( 0 );
When ever you commit a transaction UPDATE the trx_id field:
UPDATE trx SET trx_id = trx_id + 1
From: http://jan.kneschke.de/projects/mysql/mysql-proxy-and-a-global-transaction-id/
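What "the client injects the SQL" could look like is sketched below in Python. To keep the sketch runnable anywhere, sqlite3 stands in for a MySQL connection, so the ENGINE=memory clause is dropped and the column type is simplified. The helper `commit_with_trx_id` and the table `t` are hypothetical and not part of any plugin.

```python
import sqlite3

# sqlite3 is a stand-in for MySQL here (assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trx (trx_id INTEGER NOT NULL)")
conn.execute("INSERT INTO trx VALUES (0)")
conn.execute("CREATE TABLE t (v TEXT)")
conn.commit()


def commit_with_trx_id(connection):
    """Inject the counter UPDATE, commit, and return the new transaction ID."""
    connection.execute("UPDATE trx SET trx_id = trx_id + 1")
    (trx_id,) = connection.execute("SELECT trx_id FROM trx").fetchone()
    connection.commit()
    return trx_id


conn.execute("INSERT INTO t VALUES ('first')")
first_id = commit_with_trx_id(conn)
conn.execute("INSERT INTO t VALUES ('second')")
second_id = commit_with_trx_id(conn)
print(first_id, second_id)
```

Because the counter update is part of the same transaction as the application's writes, it would travel through replication together with them.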
Dear MySQL database administrator, please drop us a note if you want this to be implemented in the PHP plugin! And now, to you, dear PHP application developer.
Use case: consistent read from slaves
MySQL replication is asynchronous. Slaves can lag behind masters, so a slave may not have replicated all writes performed on the master. A read from a slave is eventually consistent. If you need consistent reads, you usually have to read from the master. The mysqlnd plugin already has a neat configuration option (master_on_write) to automatically read only from the master after the first write. However, this adds read load to the master and MySQL replication is about read scale out …
With the help of a global transaction ID, the client can even try to do a consistent read from (asynchronous) slaves after a write to the master. If the client knows the global transaction ID of the write, it can identify a slave for reading which has already replicated the write.
/* write to master */
$link->query("INSERT ...");
$global_trx_id = mysqlnd_ms_get_last_global_transaction_id();
/*
read from a slave which has replicated global transaction ID or,
if no slave found, use master for consistent read
*/
mysqlnd_ms_set_service_level($link, MIN_GLOBAL_TRX_ID, $global_trx_id);
$link->query("SELECT ...");
$link->query("SELECT ...");
Whether the plugin’s API should be built on top of the idea of a service level (pseudo-code above), whether the plugin should search for a consistent slave before issuing the read, or whether it is left to the application to try an optimistic read is not set in stone. Blog comments are as welcome as are contributions to the wiki page https://wiki.php.net/pecl/mysqlnd_ms?&#raw_bin_ideas_rfcs…
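For the variant in which the plugin searches for a consistent slave before the read, the selection logic might look roughly like this Python sketch. It assumes the client already knows, for each slave, the last global transaction ID that slave has applied. How the plugin would obtain those numbers is exactly the open question, and `pick_server` and the server names are hypothetical.

```python
def pick_server(master, slaves, min_global_trx_id):
    """Return a slave that has replicated min_global_trx_id, else the master.

    `slaves` maps a slave name to the last global transaction ID it applied.
    """
    for name, applied in slaves.items():
        if applied >= min_global_trx_id:
            return name
    return master  # no slave is current enough: fall back to the master


slaves = {"slave_b": 18270, "slave_c": 18271}
print(pick_server("master_a", slaves, 18271))
print(pick_server("master_a", slaves, 18272))
```

The fallback to the master keeps the read consistent even when every slave lags, at the price of the extra master load discussed above.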
Issues
Client-side SQL insertion is a hack. In a heterogeneous environment, with clients of different kinds, it is all too easy to forget to insert the necessary SQL.
| | PHP 5.3+ | PHP 5.2 | MySQL prompt |
|---|---|---|---|
| Global trx id insertion? | Automatic, Plugin | Manual | Manual |
Which SQL exactly? Good question. Jan is giving an example with a MEMORY table in his blog post. A perfect choice to demo the idea, a perfect choice for a blog post! But good enough for production use? Giuseppe proposes hacking the hack - both the hack and his hack of the hack use different SQL than Jan does.
Choose for yourself. My thinking is that the plugin should - at best - propose a generic approach in the documentation but otherwise provide hooks to let the application insert whatever SQL is best for the application.
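Such a hook could be as simple as a callback that returns the SQL to run before each commit. The Python sketch below only illustrates the shape of the idea. `TrxIdInjector`, `register` and `statements_for_commit` are invented names, not a proposed plugin API.

```python
class TrxIdInjector:
    """Collects application-supplied SQL to run just before each commit."""

    def __init__(self):
        self._hooks = []

    def register(self, hook):
        """Register a callable that returns a list of SQL strings."""
        self._hooks.append(hook)
        return hook

    def statements_for_commit(self):
        """Ask every hook for its SQL, in registration order."""
        sql = []
        for hook in self._hooks:
            sql.extend(hook())
        return sql


injector = TrxIdInjector()


@injector.register
def bump_counter():
    # The application decides which SQL maintains its transaction ID.
    return ["UPDATE trx SET trx_id = trx_id + 1"]


print(injector.statements_for_commit())
```

An application preferring different SQL would simply register a different callback.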
End of brainstorming
All of the above is brainstorming. No promise it will ever materialize.
I don’t have any more "big" ideas or keywords in my notes for the PHP replication plugin. It’s time for me to go back to the source editor to check what Andrey did to prepare support for partitioned replication…





























