Cheerp is a C++ compiler for the Web platform. Roughly a year ago we released Cheerp 1.0 with the promise of making C++ a first class language for the Web, with full access to DOM and HTML5 APIs (including WebGL) and great performance. At that time, we could only partially meet that promise.
With our early adopters starting to use Cheerp on real-world, large-scale applications, we were proud to see that Cheerp could indeed be used to seamlessly integrate C++ code into HTML5 apps. But we also realized that the performance of the compiled code was disappointing on real codebases.
As an example, our first benchmarking result on our first large-scale (~1M sloc) customer code was around forty times (40x) slower than native. Not only was this result disappointing, but it also was much worse than what we were expecting based on our internal benchmarks.
One year later, after significant effort focused on performance optimizations, we are here today to announce Cheerp 1.1.
Now we can fulfill our original promise.
This four- to six-fold improvement in speed has been validated not only on publicly available benchmarks, but also verified on real-world commercial code.
Cheerp 1.1 is now robust, battle-hardened and ready to be used on complex, large-scale codebases. Download it from our website.
In this long post, we draw a renewed comparison with Emscripten, present recent performance benchmarks against native and analyze future steps for Cheerp.
How is Cheerp different from Emscripten?
This approach makes it possible to port code with small changes and achieve high output performance. It also comes at a cost, as the typed array that stores all the data needs to be pre-allocated to its maximum size at the beginning of the program.
This means that the developer must be able to decide what the maximum memory size is at build time, with no possibility of scaling it with the capabilities of the users’ devices, and no dynamic memory allocation.
In our opinion this approach is far from ideal, for two reasons. First, it creates a bad incentive for developers, since preallocating a large amount of memory such as 512MB or 1GB is the easiest solution to the problem. Second, it basically prevents the coexistence of multiple applications built on this paradigm. This is especially true in web environments with tighter memory limits, such as mobile devices.
Last but not least, the asm.js specification currently limits the size of typed arrays, and hence the maximum memory that each application can allocate, to 2GB.
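The constraint of a single pre-allocated array can be pictured with a small C++ sketch. It is not Emscripten's actual allocator: the LinearHeap class, its methods and the kOutOfMemory sentinel are hypothetical names chosen only to show what it means for all data to live in one array whose size is fixed up front.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sentinel returned when the fixed heap cannot satisfy a request.
constexpr std::size_t kOutOfMemory = static_cast<std::size_t>(-1);

// Hypothetical linear heap: one byte array, sized once at construction,
// from which every "allocation" is carved out as an offset.
class LinearHeap {
public:
    // The capacity is decided here and never grows afterwards.
    explicit LinearHeap(std::size_t capacity) : storage_(capacity), top_(0) {}

    // Returns the offset of the new block inside the array, or
    // kOutOfMemory if the preallocated space is exhausted.
    std::size_t allocate(std::size_t size) {
        assert(size > 0);
        if (size > storage_.size() - top_) return kOutOfMemory;
        const std::size_t offset = top_;
        top_ += size;
        return offset;
    }

    std::size_t capacity() const { return storage_.size(); }
    std::size_t used() const { return top_; }

private:
    std::vector<std::uint8_t> storage_;  // stands in for the typed array
    std::size_t top_;                    // first free byte
};
```

With a 16-byte heap, two 8-byte requests succeed and any further request fails, no matter how much memory the device has left. That is the reason a developer is tempted to pick a very large capacity from the start.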
To make this possible, Cheerp requires the developer to apply stricter than usual type safety to C++, which results in some additional porting work.
But it turns out that most C++ is usually well behaved and type safe already, and on our customers' codebases we have been able to fully port the code by changing around 0.1% of the original code, which we believe is a reasonable porting effort.
This is especially true since most of the changes tend to amount to a few repeated interventions on type safety and a limited amount of architectural changes required to bring an application to the Web, such as interfacing with DOM events and removing threading code. Most of these architectural changes would be needed anyway, as the web platform has some intrinsic differences which cannot be emulated (e.g. no real multi-threading).
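As a rough illustration of what a type-safety intervention can look like, consider code that reads the fields of a struct through a cast pointer. The sketch below is an assumption about the kind of change meant here, not a list of what Cheerp accepts or rejects. Vec3, sum and component are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

struct Vec3 {
    float x, y, z;
};

// Type-unsafe style, shown only as a comment: the struct is reinterpreted
// as a raw float array, which depends on its in-memory layout.
//
//   float* p = reinterpret_cast<float*>(&v);
//   return p[0] + p[1] + p[2];

// Type-safe style: members are accessed by name, so no assumption
// about the memory layout is needed.
float sum(const Vec3& v) {
    return v.x + v.y + v.z;
}

// Indexed access without pointer casts.
float component(const Vec3& v, std::size_t i) {
    assert(i < 3);
    switch (i) {
        case 0: return v.x;
        case 1: return v.y;
        default: return v.z;
    }
}
```

The rewritten functions are ordinary C++ and behave the same on a native build, which is consistent with the observation that such changes tend to be small and repetitive.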
With Cheerp, out of bounds accesses on an object will not give access to any other data structure. We believe that the security implications of Emscripten have not been properly understood and we intend to make a detailed post about this in the future.
On asm.js and startup time
Being fans of typed languages, we really like this idea.
One interesting side effect of using asm.js is that the browser needs to fully validate the entire code before being able to execute it, which introduces a large start time penalty (as analyzed by the Emscripten team here).
It is very apparent that Cheerp code starts execution much faster than Emscripten + asm.js (approximately 80% faster) and also faster than Emscripten code compiled without asm.js (40% faster).
We have benchmarked Cheerp against Emscripten with asm.js on a set of small-scale micro-benchmarks and on two larger-scale macro-benchmarks included in the Emscripten distribution.
As expected, the macro-benchmarks required some amount of porting. You can find a patch that can be applied to the Emscripten repository to reproduce the results here.
The figure below shows the performance of these benchmarks compiled with Cheerp 1.1 and Emscripten + asm.js, relative to native. The line at the top represents the approximate performance of our previous release, Cheerp 1.0, on real-world codebases.
Methodological note: All times are normalized to native performance. For the copy benchmark, the native version is built using gcc instead of clang, as the code generated by clang was very poor. Benchmarks were compiled using Emscripten v1.32.0. Execution times were evaluated using the test_benchmark.py test runner from the Emscripten project in a 64-bit Linux environment. The Spidermonkey build is the one shipped with Firefox 38.0.1. The V8 build is git commit cc2d376f11503646e278ebe1e65fd3a9fee7c91e.
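Normalizing to native means dividing each measured time by the time of the native build of the same benchmark. A minimal sketch of that calculation follows. The function name slowdown and its parameters are hypothetical and are not taken from test_benchmark.py.

```cpp
#include <cassert>

// Slowdown of a compiled-to-JavaScript run relative to the native run of
// the same benchmark. A value of 1.0 means "as fast as native".
double slowdown(double measuredSeconds, double nativeSeconds) {
    assert(nativeSeconds > 0.0);
    return measuredSeconds / nativeSeconds;
}
```

For example, with made-up timings of 10 seconds in the browser and 2 seconds natively, the benchmark would be reported as a 5x slowdown.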
The first thing that jumps out is the massive improvement we have achieved since Cheerp 1.0.
Code compiled using Cheerp is always at or below a 10x slowdown compared to native, while the maximum slowdown under V8 is 6.4x. In many cases, compiled code runs at less than 5x native. Performance is overall very close to Emscripten on small tests (copy to skinning), and worse than Emscripten on larger-scale tests (bullet and box2d).
The figure below focuses on the performance of Cheerp 1.1 compared to Emscripten on micro-benchmarks.
On micro-benchmarks Cheerp is between 1 and 5 times slower than native, and 1 to 2 times slower than Emscripten + asm.js (sometimes faster, see memops under V8).
In general terms, we believe that there is little to no difference between Cheerp and Emscripten on pure computational code.
It is interesting to see that the best result comes sometimes from V8 and other times from Spidermonkey. Great food for thought for future discussions.
The figure below shows the performance results on two larger scale codebases, bullet and box2d.
Cheerp clearly has more overhead on more complex code that makes heavy use of memory, but it is still somewhere around 3 to 5 times slower than Emscripten.
Moreover, as you can see, Cheerp in general performs better on Chrome (V8) than on Firefox (Spidermonkey). This is partially expected, as we have implemented some optimizations that are specific to the V8 architecture, but we have not yet started similar in-depth studies on Spidermonkey.
On larger-scale benchmarks that make heavier use of dynamic memory, Cheerp on V8 stays below a 7x slowdown, without using static memory allocation and without using asm.js. We know that this slowdown is for the most part caused by code paths which are especially heavy on Cheerp, e.g. allocating large and complex objects inside frequently called functions.
It is important to note that these effects can be mitigated not only with better optimizations within Cheerp, but also by partially reworking the source code keeping in mind that the target architecture will be a Web browser.
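One example of such a rework is moving an allocation out of a frequently called function. The sketch below is generic C++ and makes no measured claim about Cheerp. The function names are hypothetical, and the scratch buffer stands in for any large object that is rebuilt on every call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Before: a fresh scratch buffer is allocated on every call.
double sumOfSquaresAllocating(const std::vector<double>& input) {
    std::vector<double> scratch(input.size());
    double total = 0.0;
    for (std::size_t i = 0; i < input.size(); ++i) {
        scratch[i] = input[i] * input[i];
        total += scratch[i];
    }
    return total;
}

// After: the caller allocates the scratch buffer once and passes it in,
// so the hot function performs no allocation at all.
double sumOfSquaresReusing(const std::vector<double>& input,
                           std::vector<double>& scratch) {
    assert(scratch.size() >= input.size());
    double total = 0.0;
    for (std::size_t i = 0; i < input.size(); ++i) {
        scratch[i] = input[i] * input[i];
        total += scratch[i];
    }
    return total;
}
```

Both versions compute the same result. The second one shifts the allocation cost to the caller, which pays it once instead of on every invocation.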
While we are happy with the results obtained in the last year, we now know for sure that we can and will do much better in future releases.
Thanks to the months spent optimizing the performance of generated code, we now have a clear understanding of how we can reduce this performance gap even more and get closer to native performance.
We also plan to enable at some point support for asm.js, even with its current highly limited specifications. We want to allow developers to selectively use asm.js on a type-by-type or method-by-method basis, deliberately accepting the advantages and limitations that this implies. We believe that asm.js is not needed on the whole program to achieve good performance (as this post by the Emscripten team suggests as well) and that a selective use is the best compromise.
From now on, we will also release more often so you can always access a better programming experience. Stay tuned.
Can I use it?
Cheerp is free and open source software, with some components released under the GPL license.
You are welcome to use it within the GPL restrictions.
We offer proprietary-friendly licensing as well. Visit our website for details on our license model.
You can get Cheerp 1.1 release binaries on Launchpad. Take a look at our wiki for info and tutorials on Cheerp.
Cheerp is available for Linux, Windows and Mac OS X.
Leaning Technologies Ltd. is the startup company behind Cheerp. Besides our C++ compiler for the Web, we provide consulting services to help your business bring your application to the Web. For more information, drop us a line at firstname.lastname@example.org.