Archive for the ‘Programming’ Category
OpenSocial is now available in our VZ Sandbox
We are proud to announce that our Gadget Sandbox with OpenSocial 0.8.1 integration is up and running. Developers can upload and test their gadgets against our platforms and request the approval to make them available to our users.
All interested developers are invited to join our OpenSocial support group on meinVZ and studiVZ where more information can be found about how to become a VZ OpenSocial Gadget developer.
We strongly believe that OpenSocial should also be taken literally, so we started a monthly GeekNight where all interested developers are informed about our current projects and the release status of our OpenSocial implementation.
OpenSocial will become fully available by the end of the year. This leaves enough time to have stable and tested gadgets ready for the container’s OpenSocial launch.
However, we already use Google Gadgets to provide interactive ads and games on our platforms.
Automated acceptance tests using Selenium Grid without parallelization
Originally, we built the automated acceptance tests for our agile development process on a continuous integration server using Selenium with a single Selenium Remote Control (RC). The tests were executed in a fixed browser on a specific operating system. As the number of test cases in the suite grew, the execution time of the builds increased rapidly, so the tests could no longer identify defects promptly and thus lost part of their purpose.
The common strategy to solve this problem is to install Selenium Grid on the machine that formerly hosted the single Selenium RC in order to parallelize test execution. By connecting just 4 Selenium RCs to this Grid Hub, the execution time of the tests can be reduced by up to a factor of 4 without any additional hardware and without rewriting the tests. The only prerequisite is that the testing framework supports parallel test execution, i.e. it must be able to start more than one test of a suite simultaneously and assign the responses supplied by Selenium Grid back to the right test.
Although our testing framework, PHPUnit, does not yet support parallel test execution, we found a way to benefit from Selenium Grid in our testing environment. Our continuous integration server allows us to set up more than one build agent to run the Selenium-driven acceptance tests. If we did this with one single Selenium RC, the agents would quickly overload it, because there is no way to check its state: the agents would start new tests no matter how many tests are already running on this RC.
So we installed Selenium Grid with 4 connected Remote Controls as described above. We can now control the number of simultaneously running tests, because Selenium Grid starts only as many test suites as there are RCs connected. Other incoming requests are queued in the Selenium Grid Hub until one of the connected RCs has finished its test suite. Unlike the common usage of Selenium Grid, this solution does not yet give us real parallelization: the test suites from the different build agents run simultaneously, but the tests within each suite are still processed sequentially. We always have the option of switching to real parallelization once our testing framework supports it.
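The queueing behaviour described above can be illustrated with a small, self-contained JavaScript simulation. This is not Selenium Grid code: the function name simulateHub, the suite names and the durations are all made up for the illustration, and it assumes every suite arrives at the hub at the same moment.

```javascript
// Toy simulation of a hub that hands test suites to a fixed number of
// remote controls (RCs) and queues the rest. Not Selenium Grid code.
function simulateHub(suites, rcCount) {
  // freeAt[i] is the minute at which RC i becomes available again.
  const freeAt = new Array(rcCount).fill(0);
  const schedule = [];
  for (const suite of suites) {
    // Pick the RC that becomes free first; the suite waits until then.
    let rc = 0;
    for (let i = 1; i < rcCount; i++) {
      if (freeAt[i] < freeAt[rc]) rc = i;
    }
    const start = freeAt[rc];
    const end = start + suite.minutes;
    freeAt[rc] = end;
    schedule.push({ name: suite.name, rc: rc, start: start, end: end });
  }
  return { schedule: schedule, totalMinutes: Math.max(...freeAt) };
}

// Five suites from five build agents (hypothetical), four RCs connected.
const demoSuites = [
  { name: 'agent-1', minutes: 30 },
  { name: 'agent-2', minutes: 20 },
  { name: 'agent-3', minutes: 25 },
  { name: 'agent-4', minutes: 10 },
  { name: 'agent-5', minutes: 15 },
];
const demoResult = simulateHub(demoSuites, 4);
console.log(demoResult.schedule);
```

In this sketch the fifth suite is held back until the first RC finishes, which is the effect we rely on: the agents can submit as much as they like, but no RC ever runs more than one suite at a time.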
mckoy – [m]em[c]ache [k]ey [o]bservation [y]ield
We wanted to speed up our web applications by reducing the load on our databases, so we decided to use the distributed memory object caching system memcached. Because of the high number of requests to our memcached systems (about 1.5 million requests per second), we built a tool called mckoy, which can produce statistics and debugging information about all memcache requests in our network.
mckoy is a memcache protocol sniffer (based on the pcap library) and statistics builder. It automatically detects and parses each key (and its value) and the memcache API methods. At the end of the sniffing session, the results are used to build the statistics. mckoy was written to analyse our web application and its usage of the memcache API in PHP. For example, we wanted to know how many set() and get() methods were invoked in a given time. Based on these results, we had to make changes to improve our usage of the memcache API for PHP. You can run mckoy on any UNIX-based system; it was tested on many *BSD and Linux systems. mckoy is licensed under GPLv3 and fully published as an open source project!
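To give an idea of the kind of counting involved, here is a minimal JavaScript sketch that counts commands in a captured stretch of the memcached text protocol. It is not taken from mckoy (which works on live packets via pcap); the function name countCommands and the sample stream are assumptions for illustration, and the sketch treats the stream as plain ASCII so that byte counts equal character counts.

```javascript
// Count memcached text-protocol commands whose key contains a pattern.
// Illustration only; this is not how mckoy itself is implemented.
const STORAGE_COMMANDS = ['set', 'add', 'replace', 'append', 'prepend', 'cas'];
const RETRIEVAL_COMMANDS = ['get', 'gets'];

function countCommands(stream, keyPattern) {
  const counts = {};
  let pos = 0;
  while (pos < stream.length) {
    const eol = stream.indexOf('\r\n', pos);
    if (eol === -1) break; // incomplete command line, stop here
    const parts = stream.slice(pos, eol).split(' ');
    pos = eol + 2;
    const command = parts[0];
    // Retrieval commands may carry several keys, all others carry one.
    const keys = RETRIEVAL_COMMANDS.includes(command)
      ? parts.slice(1)
      : parts.slice(1, 2);
    if (STORAGE_COMMANDS.includes(command)) {
      // "<command> <key> <flags> <exptime> <bytes>" is followed by a data
      // block of <bytes> bytes plus "\r\n"; skip it so that the payload
      // is never mistaken for a command.
      pos += Number(parts[4]) + 2;
    }
    if (keys.some((key) => key.includes(keyPattern))) {
      counts[command] = (counts[command] || 0) + 1;
    }
  }
  return counts;
}

// Hypothetical captured client traffic.
const sampleStream =
  'set foobar:1 0 0 5\r\nhello\r\n' +
  'get foobar:1 other\r\n' +
  'get other\r\n' +
  'set x 0 0 3\r\nget\r\n';
console.log(countCommands(sampleStream, 'foobar'));
```

Skipping the data block matters: in the last command of the sample the stored value happens to be the text "get", which a naive line-by-line counter would count as a request.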
You can run mckoy in various modes (see manpage!). For example, if you want to sniff pattern “foobar” for all memcache-api methods and with live capturing, use:
mckoy -i <interface> -e "port 11211" -m 5 -k foobar -v
And this is what it looks like:

Unfortunately, there are some known bugs. :) For example, a SIGSEGV occurs when the user sends ^C. We also noticed that mckoy isn’t able to handle memcached-1.2.8 <= 1.4.* correctly. These bugs will be fixed in the next version as soon as possible! For the next version I am also planning to add UDP and binary protocol support.
You can officially download mckoy from:
http://www.lamergarten.de/releases.html
or
http://sourceforge.net/projects/mckoy/
cheers.
About Erlang/OTP and Multi-core performance in particular – Kenneth Lundin
I attended an awesome talk by Kenneth Lundin about Erlang/OTP at the Erlang Factory in London. The main topic was SMP and its improvements in the latest release(s). That is exactly one of the main reasons for using Erlang: parallelizing computations on many cores without worrying about locks in shared memory.
Some of the issues they’ve been working on:
- Erlang now detects CPU Topology automatically at startup.
- Multiple run-queues
- You can lock schedulers to logical CPUs
- Improved message passing – reduced lock time
They improved more things of course but considering SMP these are the most important ones.
- Erlang now detects the CPU topology of your system automatically at startup. You may still override this automatic setup using:
erl +sct L0-3c0-3
erlang:system_flag(cpu_topology,CpuTopology).
- Multiple run queues … what does that mean? We should first take a look at how Erlang does SMP:
  - Erlang without SMP:
  Without SMP support the Erlang VM had one scheduler for one run queue. So all the jobs were pushed onto one queue and fetched by one scheduler.
  - Erlang SMP / before R13:
  They started more schedulers that were pulling jobs from one queue. That sounds more parallel, but it still did not perform as well as desired on many cores.
  - Erlang SMP R13:
  Several schedulers as in the former solution, but each of them has its own run queue. The problem with this approach is that you can of course end up with some empty and some full queues because of the different runtimes of the processes. So they built something called migration logic that controls and balances the different run queues.
The migration logic does the following:
- collects statistics about the maximum length of all schedulers’ run queues
- sets up migration paths
- takes jobs away from fully loaded schedulers and pushes them onto the queues of schedulers with low load
What happens depends on whether the system is running at full load or not. If not all schedulers are fully loaded, jobs are migrated to schedulers with lower ids, which leaves some schedulers inactive.
This makes perfect sense, because the more schedulers and run queues you need, the more migration has to be done. Using SMP support with many schedulers only makes sense if you are really optimizing for many cores; on systems with few cores you will see decreased performance.
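The balancing half of this idea can be sketched in a few lines of JavaScript. This is a deliberately naive illustration and not the algorithm used inside the Erlang VM: the function name balanceRunQueues is hypothetical, the queues are plain arrays of job ids, and the compaction onto lower scheduler ids under low load is left out.

```javascript
// Naive run-queue balancing: repeatedly move one job from the longest
// queue to the shortest one until the lengths differ by at most one.
// Illustration only, not the Erlang VM's migration logic.
function balanceRunQueues(queues) {
  const balanced = queues.map((queue) => queue.slice()); // keep input intact
  for (;;) {
    let longest = 0;
    let shortest = 0;
    balanced.forEach((queue, i) => {
      if (queue.length > balanced[longest].length) longest = i;
      if (queue.length < balanced[shortest].length) shortest = i;
    });
    if (balanced[longest].length - balanced[shortest].length <= 1) break;
    // "Migration path": take a job from the full queue, push it onto the
    // queue of the scheduler with the lowest load.
    balanced[shortest].push(balanced[longest].pop());
  }
  return balanced;
}

// Four schedulers, one of them overloaded (job ids are made up).
const before = [[1, 2, 3, 4, 5, 6], [], [7], [8]];
const after = balanceRunQueues(before);
console.log(after.map((queue) => queue.length));
```

Even this toy version shows the cost mentioned above: every extra queue adds more bookkeeping and more moves, which is only worth it when there are enough cores to keep the schedulers busy.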
- Binding schedulers to CPUs is really worth looking at. The more cores your CPU has, the more important it becomes and the more performance you will gain. You can force the Erlang VM to do scheduler binding with:
erl +sbt db
erlang:system_flag(scheduler_bind_type,default_bind).
For example:
1> erlang:system_info(cpu_topology).
[{processor,[{core,{logical,0}},
{core,{logical,3}},
{core,{logical,1}},
{core,{logical,2}}]}]
2> erlang:system_info(scheduler_bindings).
{unbound,unbound,unbound,unbound}
fabrizio@machine:~$ erl +sbt db
1> erlang:system_info(scheduler_bindings).
{0,1,3,2}

Source: presentation Kenneth Lundin – Erlang-Factory
You can test and benchmark SMP using the following flags:
fabrizio@machine:~$ erl -smp disable //default is auto
fabrizio@machine:~$ erl +S 2:4 //Number of Schedulers : Schedulers online
With erlang:system_info/1 you can use the following atoms:
# cpu_topology
# multi_scheduling
# scheduler_bind_type
scheduler_bindings
logical_processors
multi_scheduling_blockers
scheduler_id
schedulers
# schedulers_online
smp_support
The ones marked with # can be set using system_flag/2
Memcache Feeds
Buschfunk, the feature that shows your friends' status messages (”Ist gerade…”, i.e. "is currently…") on our VZ platforms, is just the beginning of the VZ feeds. In the end, Buschfunk is merely the aggregation of all your friends' status messages on the start page.
After the launch of the first version of Buschfunk, there was a huge impact on our server farm within a few minutes. As a result, we had to move the status message database to dedicated servers after just one day. Here you can see the absolute number of status updates per minute (split into studiVZ/meinVZ = blue/flat graph and schuelerVZ = red/steep graph):

Feeds, i.e. notifications about a friend's status updates, can be implemented in different ways, and each way comes with challenges, especially when the goal is not only optimal storage but also fast access and logical merging of similar feed entries.
For our first tests towards social feeds we chose a pure memcache implementation. The advantage is that user posts, which are only of momentary interest anyway, do not have to be kept in the database, and memcache operations are considerably faster on top of that. In the simplest case, you build one queue of sorted entries per user and store it as an entity in memcache. Since you do not need to keep an unlimited number of entries, and a memcache object may by default be at most 1 MB in size anyway, you limit the queue to a fixed number of entries and simply discard old entries when a new one arrives.
OK, enough introduction, let's look at a few code snippets. It is best to build an interface that takes care of handling feed entries:
interface FeedEntry
{
    // referenced memcache object, i.e. the actual content
    public function getFeedEntryReference(...);

    // is the entry visible -> privacy
    public function isFeedEntryHidden(...);

    // initial setup
    public function initializeFeedEntry(...);

    [...]
}
Now we have an arbitrarily extensible feed type and can start with implementations such as status changes of user profiles, microblog entries, etc.
What is still needed is the actual construction of the queue, i.e. filling the feed with feed types. You should think about what an entry in the queue should look like. It should contain the user's id, a timestamp, and the actual content. We decided to keep the content of an entry only as a reference in the queue, so that a change does not require updating every queue object but only the referenced memcache object. Besides, you do not want to hold the data in memory twice.
As an example, a simplified queue could look like this:
$queue = array(
    0 => array('timestamp' => time(), 'userId' => 123456789, 'contentId' => 1000, 'type' => TYPE_MICROBLOG),
    1 => array('timestamp' => time(), 'userId' => 1234567910, 'contentId' => 1001, 'type' => TYPE_PHOTOCOMMENT)
    [...]
);
Using the type and its concrete implementation of a feed entry, the actual content can now be fetched from another memcache object or from the database. The queue is already sorted after a new entry has been inserted. If the content behind such a reference is deleted, the queues of the users whose feeds are affected do not need to be updated, because the feed notices automatically on reading that the reference is no longer valid and “skips” it.
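The queue handling described above can be sketched as follows. The sketch is written in JavaScript rather than PHP and uses a plain Map as a stand-in for memcache; the key prefixes, the function names pushFeedEntry and readFeed, and the tiny limit of three entries are assumptions made for the illustration only.

```javascript
// Bounded per-user feed queue holding references to content objects.
// A Map stands in for memcache; all names here are hypothetical.
const MAX_ENTRIES = 3; // deliberately tiny so the demo shows truncation

function pushFeedEntry(cache, userId, entry) {
  const key = 'feed:' + userId;
  const queue = cache.get(key) || [];
  queue.unshift(entry); // newest entry first, so the queue stays sorted
  queue.length = Math.min(queue.length, MAX_ENTRIES); // drop oldest entries
  cache.set(key, queue);
}

function readFeed(cache, userId) {
  const queue = cache.get('feed:' + userId) || [];
  const visible = [];
  for (const entry of queue) {
    const content = cache.get('content:' + entry.contentId);
    if (content === undefined) continue; // stale reference: skip it
    visible.push({ contentId: entry.contentId, type: entry.type, content: content });
  }
  return visible;
}

// Demo: four posts, then one of them is deleted.
const cache = new Map();
[1000, 1001, 1002, 1003].forEach((contentId) => {
  cache.set('content:' + contentId, 'post ' + contentId);
  pushFeedEntry(cache, 42, { timestamp: Date.now(), userId: 7, contentId: contentId, type: 'microblog' });
});
cache.delete('content:1002'); // the queue itself is not touched
console.log(readFeed(cache, 42).map((entry) => entry.contentId));
```

Note that deleting the content never touches the queue: the stale reference stays in place and is simply ignored on the next read, which is what makes deletes cheap in this design.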
One more tip for building the queue: prefer repeated array_reverse in combination with array_push over array_shift! That is worlds faster when prepending or appending an element.
(Source: http://www.ingo-schramm.de/blog/archives/9-PHP-array_shift-does-not-scale.html)
Since memcache frees rarely accessed areas when its assigned memory fills up, you are forced to think about how to deal with data loss within the queue. For this you could implement cyclic backups of the queue entries or an initial fill of the feed entries (already provided for in the interface). The actual data (e.g. the concrete status update) is of course preserved and stored persistently in the database; this only concerns the references to these entries.
Erlang R13A Benchmark
I made a little benchmark to check out the new Erlang release R13A and the behavior of the multiple run queues. The benchmarking program was the same one I used in another benchmark you may find here. You may also find the sources at that location. As already noted there, the slope from 1 CPU to 2 CPUs is due to the “bad” implementation made to challenge the Erlang SMP features. The machine was an 8-core Intel Xeon at 3 GHz with a 64-bit 2.6.9 Linux, with Erlang kernel polling active.
Take a Look – Hingucker on the Profile
The next step towards Google Gadget integration on our platform is complete. And we have a new baby, the “Hingucker” (eye-catcher).
From now on, every user can take their favourite Hingucker from a group and put it on their own profile.
We are starting with the 11Freunde Hingucker (StudiVZ/MeinVZ).
Have fun!

A Slice of studiVZ – Web Slices
Once again the release of a new browser is on the agenda: with Internet Explorer 8, Microsoft is bringing a brand-new reincarnation of its house browser to market. It is supposed to be fast (according to Microsoft's own statements), and a few new features await us: from Accelerators (e.g. select an address and view it directly on Google Live Maps) through InPrivate Browsing (also known as “porn mode”) to the SmartScreen Filter, which is meant to protect us from malicious websites. And of course much, much more.
Internally, our HTML/CSS and JavaScript specialists in particular have already been hard at work getting our site (and above all the chat) ready for IE8. Fortunately, there are not that many differences from IE7, so the adjustments go fairly quickly.
Another new feature, perhaps even the biggest one, is already supported by our platforms: Web Slices (Web Slice Format Specification). Web Slices are essentially a mixture of favourites with a preview function and RSS feeds. You can subscribe to parts of a web page, which are then updated automatically at regular intervals. The user is notified automatically about changes and does not have to become active every time. In itself, quite a nice feature. We are also already represented in the IE Add-on Gallery, on view at www.ieaddons.com/de/social.
On our site you can add a Web Slice that gives you a full overview of the activity in your own network. Who visited my page? Do I have new messages? Has someone shown me something? Those are just some of the questions answered there, no matter which page you are currently on. The prerequisite is that the user has activated our (also fairly new) Remember Me function (which sets a persistent cookie on the user's machine); otherwise you have to log in again every time.
Technically, a Web Slice has a simple structure:
- A surrounding (DIV) tag with the class “hslice” and an ID
- A tag with the class “entry-title”, whose content defines the title
- A tag with the class “entry-content” for the actual content to be displayed
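As a rough illustration of these three parts, the following JavaScript helper assembles the markup for a minimal Web Slice. It is only a sketch: the function names, the choice of an h2 element for the title, the id and the sample content are assumptions, and this is not the markup actually used on studiVZ.

```javascript
// Escape text so it can be placed safely into HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build the markup of a minimal Web Slice: a container with class
// "hslice" and an id, a title element and a content element.
// contentHtml is inserted as-is and must already be trusted HTML.
function buildWebSlice(id, title, contentHtml) {
  return [
    '<div class="hslice" id="' + escapeHtml(id) + '">',
    '  <h2 class="entry-title">' + escapeHtml(title) + '</h2>',
    '  <div class="entry-content">' + contentHtml + '</div>',
    '</div>',
  ].join('\n');
}

// Hypothetical slice showing network activity.
const sliceMarkup = buildWebSlice(
  'network-overview',
  'News & activity',
  '<p>2 new messages</p>'
);
console.log(sliceMarkup);
```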
Beyond that, this basic concept of a Web Slice can of course be extended. At studiVZ, for example, we use an “Alternative Display Source” and an “Alternative Update Source”, so that an update does not have to request the complete start page but really only what is displayed in the Web Slice.
The following graphic is meant to illustrate the relationships once more:

Web Slice at studiVZ
If you already have IE8 up and running, you can subscribe to the Web Slice on the start page or in the IE Add-on Gallery. Of course the Web Slice is available for all three platforms.
Addendum:
A plug-in for Web Slices and Activities is already being developed for Firefox as well. It certainly looks interesting.
PHP SPL Data Structures Benchmark
Data structures and collections have been among the most requested features for the Standard PHP Library (SPL) over the last few years. With PHP 5.3 we’ll finally get a little of what we want, and this is good news. With data structures like stack, queue, heap or priority queue implemented in C, we expect PHP programming to become somewhat more efficient.
Inspired by this post http://blueparabola.com/blog/spl-deserves-some-reiteration we decided to run our own benchmarks to either verify or disprove the results posted. Our benchmarks were executed on a 64-bit RHEL with PHP 5.3.0beta1. As you may expect, we carefully excluded startup and compilation time and measured only the code blocks in question. We used getrusage() to determine CPU time consumption. A large number of iterations ensured smooth results.
The first structure under consideration was the SplFixedArray. If you only need numerical indices you now can create an array of fixed size that does not have to grow while more and more items are inserted. Dealing with an SplFixedArray saves you about 10 percent of runtime compared to a plain old PHP array.
Next we tried SplStack and SplQueue. It is usually easy to implement a stack and a queue with plain arrays; using array_push(), array_pop() and array_shift() is straightforward. The runtime behaviour of these functions may come as a surprise to the average PHP programmer. Worst is array_shift() because of the internal rehashing, so the experienced PHP programmer may, for critical code at least, access arrays by index and maintain counters instead, which is much more efficient. Compared to these functions, SplQueue at least is a clear winner, but it is possible to find comparable solutions with plain PHP.
There is some danger of comparing apples and oranges when turning to SplHeap and SplPriorityQueue. What is the proper representation of a heap implemented using plain old arrays only? It’s a sorted array, ok. But a heap is sorted on each insert, so do we really have to sort the array on each insert? Who will do this in real life?
It’s the use case that decides about the sorting strategy. If you are supposed to carefully separate writing the heap and reading from it, it is sufficient to sort it once. That way you beat SPL. But if you have to mix reading and writing arbitrarily the SPL will beat plain arrays by far. This is shown in the pictures below. For the mixed strategy we read once for 5 inserts and the SplMinHeap scales very well. The same holds for SplMaxHeap and SplPriorityQueue.
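To make the mixed read/write case concrete, here is a minimal binary min-heap in JavaScript together with the access pattern from our benchmark (one read for every 5 inserts). It is a sketch for illustration and not the SPL implementation; the class name MinHeap and the helper mixedWorkload are made up. Each insert and each read costs O(log n), whereas a plain array would have to be re-sorted before every read.

```javascript
// Minimal binary min-heap stored in an array. Illustration only.
class MinHeap {
  constructor() {
    this.items = [];
  }

  insert(value) {
    const items = this.items;
    items.push(value);
    let i = items.length - 1;
    // Sift the new value up until its parent is not larger.
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (items[parent] <= items[i]) break;
      [items[parent], items[i]] = [items[i], items[parent]];
      i = parent;
    }
  }

  extract() {
    const items = this.items;
    if (items.length === 0) return undefined;
    const top = items[0];
    const last = items.pop();
    if (items.length > 0) {
      // Move the last item to the root and sift it down.
      items[0] = last;
      let i = 0;
      for (;;) {
        const left = 2 * i + 1;
        const right = left + 1;
        let smallest = i;
        if (left < items.length && items[left] < items[smallest]) smallest = left;
        if (right < items.length && items[right] < items[smallest]) smallest = right;
        if (smallest === i) break;
        [items[smallest], items[i]] = [items[i], items[smallest]];
        i = smallest;
      }
    }
    return top;
  }
}

// Mixed strategy: read the minimum once for every 5 inserts.
function mixedWorkload(values) {
  const heap = new MinHeap();
  const reads = [];
  values.forEach((value, index) => {
    heap.insert(value);
    if ((index + 1) % 5 === 0) reads.push(heap.extract());
  });
  return reads;
}

console.log(mixedWorkload([9, 4, 7, 1, 8, 3, 6, 2, 5, 0]));
```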
Lessons learned:
- SPL rules
- use SPL data structures where appropriate for a particular use case, they are efficient and comfortable
- benchmarking is error prone
- anyway, always benchmark performance critical code blocks
A good piece of geek stuff – client side image processing with Gears
In the latest downtime we released a beta version of a new photo uploader. Since we removed the Java-based uploader some time ago, we’ve been dreaming of offering our users an uploader that can do the same as the applet did, without requiring Java and of course with fewer conflicts on the different clients.
So the requirements for the uploader were:
- uploading (of course :D)
- scaling
- rotating
- compression
We implemented a Flash-based uploader that could do multi-upload but could neither scale nor compress the pictures before upload. So there was the problem: no direct access to the local files on the client.
At the Google Developer Days in Munich last year, a colleague and I heard about the possibilities Gears offers, and we were quite surprised how far it pushes the abilities of the client. Dreaming of all the geeky things I could do with Gears, I also hoped to be able to solve that file access problem. Unfortunately, Gears did not yet offer the announced canvas API, and the desktop API had not implemented the needed interfaces either.
November 24, 2008: Google released the 0.5 version of Gears and there it was, the local server offered captureBlob().
Enough story, let’s look at the code …
var desktop = google.gears.factory.create('beta.desktop');
var localServer = google.gears.factory.create('beta.localserver');
var store = localServer.createStore('picturesTemp');
We created the needed Gears features: desktop for the file access, and localserver to store the files on the client.
$('#openFile').bind('click', function(){
gearsComponents_.desktop.openFiles(openFilesCallback_, {
filter: ['image/jpeg', 'image/png', 'image/bmp', 'image/gif']
});
});
Here we bind the file picker dialog to a button, providing a filter that limits the files shown to the supported types.
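var url = yourdomain; // placeholder: a URL on your own domain under which the blob is stored
var openFilesCallback_ = function(files){
// take the first of the selected files
var file = files.shift();
// the content type is derived from the file extension, e.g. "image/png"
gearsComponents_.pictureStorage.captureBlob(file.blob, url, "image/" + file.name.substring(file.name.lastIndexOf('.') + 1));
};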
Capturing the blob to the local server as shown above solves two problems: we can now access the files, and we can import them into the canvas element, because they can be delivered from the same domain as the main page, so there is no cross-domain security problem.
var canvas = $('<canvas>').get(0);
var context = canvas.getContext('2d');
canvas.width = canvasOriginal.width * fac;
canvas.height = canvasOriginal.height * fac;
context.scale(fac, fac);
context.drawImage(canvasOriginal, 0, 0);
In this example, canvasOriginal is the canvas/image from the local server. We can now rotate, scale, etc. the picture, provided the browser supports these actions on the canvas element.
You can also apply filters to the images by extracting the picture information as a pixel array, modifying it, and pushing it back:
// get the imagedata
var imgdata = canvasOriginalContext.getImageData(0, 0, canvasOriginal.width, canvasOriginal.height);
// do something with the pixel data
// push the imagedata back
context.putImageData(processedData, 0, 0);
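As an example of the "do something with the pixel data" step, here is a small grayscale filter. It is a sketch rather than the filter used in the uploader: the function name toGrayscale and the luma weights are assumptions, and it works on any RGBA byte array laid out like the data property of the object returned by getImageData().

```javascript
// Convert RGBA pixel data (4 bytes per pixel) to grayscale.
// Returns a new array and leaves the input untouched.
function toGrayscale(pixels) {
  const out = new Uint8ClampedArray(pixels.length);
  for (let i = 0; i < pixels.length; i += 4) {
    // Weighted sum of red, green and blue (common luma weights).
    const luma = 0.299 * pixels[i] + 0.587 * pixels[i + 1] + 0.114 * pixels[i + 2];
    out[i] = luma;     // red
    out[i + 1] = luma; // green
    out[i + 2] = luma; // blue
    out[i + 3] = pixels[i + 3]; // keep alpha unchanged
  }
  return out;
}

// Two sample pixels: opaque red and a half-transparent dark gray.
const samplePixels = new Uint8ClampedArray([255, 0, 0, 255, 10, 10, 10, 128]);
const grayPixels = toGrayscale(samplePixels);
console.log(Array.from(grayPixels));
```

In the browser, the result would have to be copied back into the image data object (for instance value by value into imgdata.data) before handing it to putImageData().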
As JavaScript blocks while executing code, this will certainly cause serious GUI problems, so let’s use a Gears worker to solve that problem.
First we have to create a workerpool:
var workerPool = (function(){
if (window.google) {
return google.gears.factory.create('beta.workerpool');
}
}());
The worker pool needs an onmessage handler, which is called when a message from a child worker arrives:
workerPool.onmessage = function(a, b, message){
//message will contain our processed pixelarray
};
The workers have no access to the DOM, so we only pass the pixel array in and get it back in the onmessage handler:
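// script run by the child worker; it accesses the predefined workerpool
// (block comments are used inside the string, because a // comment would
// swallow the rest of the concatenated single-line script)
var script = 'var wp = google.gears.workerPool;' +
'wp.onmessage = function(a, b, message) {' +
'var data = message.body[0];' +
'/* process the data here */' +
'var reply = data; /* placeholder for the processed data */' +
'/* send the data back to the worker pool */' +
'wp.sendMessage(reply, message.sender);' +
'};';
// create a childworker by script (could also be created by url pointing to a script)
var childWorkerId = workerPool.createWorker(script);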
We currently only use workers to apply a filter to the pictures.
Now we can do anything we want with the pictures, but what about sending them to the server?
The solution to that problem is the toDataURL() method of the canvas element, which exports the picture as a “data URL” that we can send to the server with a simple XHR POST.
According to the HTML5 spec, toDataURL() should support several parameters: the first is the data type, e.g. toDataURL('image/jpeg'), and the second should be the compression rate as a float value.
The compression rate does not seem to be supported by Firefox so far, so you should leave it out; Firefox then uses its default value.
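Before posting, it can be handy to split the data URL into its MIME type and its base64 payload. The helpers below are a sketch under assumptions: the function names and the form field names (name, type, data) are invented for this example and are not what our uploader sends.

```javascript
// Split a base64 data URL such as "data:image/jpeg;base64,AAAA"
// into its MIME type and its base64 payload.
function splitDataUrl(dataUrl) {
  const match = /^data:([^;,]+);base64,(.*)$/.exec(dataUrl);
  if (!match) throw new Error('not a base64 data URL');
  return { mimeType: match[1], base64: match[2] };
}

// Build a form-encoded POST body; the field names are hypothetical.
function buildUploadBody(dataUrl, fileName) {
  const parts = splitDataUrl(dataUrl);
  return 'name=' + encodeURIComponent(fileName) +
    '&type=' + encodeURIComponent(parts.mimeType) +
    '&data=' + encodeURIComponent(parts.base64);
}

// In the browser the first argument would come from canvas.toDataURL().
const uploadBody = buildUploadBody('data:image/jpeg;base64,AAAA', 'my photo.jpg');
console.log(uploadBody);
```

The resulting string could then be sent with an XMLHttpRequest POST using the content type application/x-www-form-urlencoded.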
The released beta version of the Gears uploader only supports Firefox, because canvas support (especially picture export) is missing in most browsers. Another drawback is that Gears is not available on all OS/browser combinations.
I hope that Safari will soon support the needed canvas export methods (I haven’t looked at Safari 4 so far), so that the uploader will work with Mac/Safari.
The canvas API of Gears is already available in the sources, but it is not certain whether and when it will be released. So perhaps some day there will also be a canvas element available in IE via Gears.
Have fun with the Gears uploader!










