
Marc-Andre Lemburg: All Things Python

 3 years ago
source link: https://www.malemburg.com/

Introduction to PyRun - Python in 3.5MB

Yesterday, we announced a new version of our open-source eGenix PyRun, the “one-file Python run-time”. So what is this “one-file Python run-time”?


In 2008, eGenix started work on a product which had to run Python on Linux servers. We had looked into distributing the code just as Python files running on OS-provided Python run-times, but the many different Linux distributions and their Python variants quickly caused us to reconsider the approach. Instead, we opted to ship Python together with the product, as many companies do when distributing Python products. That way, we avoided the problem of having to adjust to all the differences found in Python installations across Linux distributions.

Now, shipping Python together with a product usually means shipping around 100MB worth of code. This may look nice from a marketing perspective (lots of code for the money), but it’s not really an ideal way of distributing products to customers.

Back to the 90s…

In the late 1990s, I started a project called mxCGIPython. At the time, web hosters only supported FTP access and Perl/shell as CGI run-times. Of course, I wanted to use Python on the hosters, so I thought to myself: wouldn’t it be great to upload a single file to the hoster’s CGI directory and then have a shell script make it executable, to use as the basis for CGI scripting?

I ran some tests with simple executables and the idea actually worked pretty well.

Next, I had to turn Python together with the standard library into a single binary. Python came with a tool called freeze to create stand-alone binaries for applications, so I pointed freeze at the standard library to create such a binary.

This worked, but did require some additional tweaks to actually make the setup work. See the README of freeze to get an idea of how it works (or read the code, like I did at the time :-)).

Since I did not have access to all the different web hosting platforms, I made the project open source. People loved the idea and sent in lots of pre-compiled binaries for all kinds of platforms - covering most of the ones used by web hosters at the time.

After a few years, hosters finally caught on and started supporting Python as a CGI platform, and nowadays it’s normal to run complete web stacks using Python as the implementation language.

Aside: The platform module you find in the Python standard library was the result of this project. I wanted a clean way to name the mxCGIPython binaries, so I wrote the platform module as a way to come up with a standardized name.
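To give a rough idea of what such naming can look like, here is a minimal sketch that derives a file name from the platform module. The naming scheme and the helper binary_name are made up for this example; they are not the scheme mxCGIPython actually used.

```python
import platform
import re


def binary_name(prefix="python"):
    """Build a file name from the current platform (hypothetical scheme)."""
    # platform.platform() returns a single descriptive string for the
    # system the interpreter is running on.
    tag = platform.platform(aliased=True, terse=True)
    # Replace anything that is awkward in file names with an underscore.
    tag = re.sub(r"[^A-Za-z0-9._-]+", "_", tag)
    return "%s-%s" % (prefix, tag)


if __name__ == "__main__":
    print(binary_name("cgipython"))
```

The exact string depends on the machine the code runs on, which is the point: the same script yields a distinct, predictable name per platform.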

Fast forward again…

Right, so we were looking for a solution to ship Python, but not using the 100MB heavy-weight approach. I remembered the mxCGIPython project and how small the footprint of those binaries was.

We gave it a try and, voilà, it worked great; well, after a few tweaks, of course.

Now, you might ask: why didn’t you simply freeze just the product into a single executable? The reason is simple. We wanted to be able to use this platform for future products as well and ideally be able to send out patches by just distributing ZIP files with the Python code.
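Distributing patches as ZIP files works because Python can import modules directly from a ZIP archive that is listed on sys.path. The following is a minimal sketch of that general mechanism, not of how our products apply patches; the archive name patch.zip, the module hotfix and both helper functions are hypothetical.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def load_from_zip(zip_path, module_name):
    """Import module_name from the ZIP archive at zip_path."""
    # With the archive first on sys.path, its modules are found before
    # same-named modules elsewhere (unless those are already imported).
    sys.path.insert(0, zip_path)
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


def demo():
    # Build a tiny "patch" archive containing one hypothetical module.
    tmpdir = tempfile.mkdtemp()
    zip_path = os.path.join(tmpdir, "patch.zip")
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("hotfix.py", "VERSION = '1.0.1'\n")
    return load_from_zip(zip_path, "hotfix").VERSION


if __name__ == "__main__":
    print(demo())
```

Because the run-time binary stays untouched, a patch in this style is just a new archive dropped next to the application.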

And, of course, we also believe that others can make good use of the technology as well, so we improved the code, turned it into a product and open sourced it.

That’s how eGenix PyRun was born, again, from the ashes, so to speak.

Working on the UI

After a few releases, we found that installation using unzip/untar is great, but having to find the location of the distribution files for the platform is not. As a result, we added a bash script install-pyrun to take on this task, which automates the installation and also adds pip and setuptools.

First, you get the script and install it somewhere as an executable:

tmp/pyrun-demo> wget https://downloads.egenix.com/python/install-pyrun
tmp/pyrun-demo> chmod 755 ./install-pyrun

Then you run it in the directory where you want the PyRun environment to be installed. Once it finishes, the interpreter is available as bin/pyrun:

tmp/pyrun-demo> ./bin/pyrun
eGenix PyRun 2.7.10 (release 2.1.1, default, Oct  1 2015, 12:01:41)
Thank you for using eGenix PyRun. Type "help" or "license" for details.
>>> 

And that’s it.

If you want a Python 2.6 version, pass --python=2.6 to the script; for Python 3.4, use --python=3.4.

Seeing is believing

Let’s have a look at the sizes:

tmp/pyrun-demo> ls -l bin/pyrun*
-rwxr-xr-x 1 lemburg lemburg 11099374 Oct  1 12:03 pyrun2.7
-rwxr-xr-x 1 lemburg lemburg 18784684 Oct  1 12:03 pyrun2.7-debug

That’s around 11MB for an almost complete Python run-time in a single file. Not bad. But we can improve this even more by using an exe-compressor such as upx:

tmp/pyrun-demo> upx bin/pyrun2.7
        File size         Ratio      Format      Name
   --------------------   ------   -----------   -----------
  11099374 ->   3549128   31.98%  linux/ElfAMD   pyrun2.7          

upx will uncompress the executable during load time, so load time increases, but it’s still impressive how small Python can get:

tmp/pyrun-demo> ls -l bin
-rwxr-xr-x 1 lemburg lemburg  3549128 Oct  1 12:03 pyrun2.7
-rwxr-xr-x 1 lemburg lemburg 18784684 Oct  1 12:03 pyrun2.7-debug
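For those who want to check the numbers, here is a small sketch that recomputes the ratio reported by upx and the sizes in MB from the byte counts in the listings above. The helper names are just for this example, and MB is taken as 10^6 bytes.

```python
ORIGINAL_BYTES = 11099374    # pyrun2.7 before compression
COMPRESSED_BYTES = 3549128   # pyrun2.7 after running upx


def ratio_percent(compressed, original):
    """Compressed size as a percentage of the original size."""
    return 100.0 * compressed / original


def megabytes(size_bytes):
    """Size in MB, using 1 MB = 1,000,000 bytes."""
    return size_bytes / 1e6


if __name__ == "__main__":
    print("ratio:      %.2f%%" % ratio_percent(COMPRESSED_BYTES, ORIGINAL_BYTES))
    print("original:   %.1f MB" % megabytes(ORIGINAL_BYTES))
    print("compressed: %.1f MB" % megabytes(COMPRESSED_BYTES))
```

This prints a ratio of 31.98%, matching the upx output, with sizes of 11.1 MB before and 3.5 MB after compression.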

Ok, it’s not as small as Turbo Pascal was when it first hit the market with a binary of only 48k, including a compiler, editor and run-time lib, but 3.5MB is mobile app size and that alone should ring a few bells :-)

Just think of how much bandwidth you’d save compared to the 100MB gorilla, when pushing your executable to all those containers in your cluster farms in order to run your application.

To make things even easier to install, we’ve recently added an -r requirements.txt parameter to install-pyrun, so you can have it install all your dependencies together with eGenix PyRun in one go.

Some things not included in eGenix PyRun

To be fair, some shared modules from the standard library are not included (e.g. ctypes, parser, readline). install-pyrun installs them in lib/pythonX.X/lib-dynload/, so that they can optionally be used, for a total of 2.5MB in .so files.
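If you want to see which of these optional modules a given run-time can actually find, a quick check along these lines should do. This is a sketch that assumes a Python 3.4+ interpreter (importlib.util.find_spec does not exist in 2.7); the helper available_modules is made up for this example. Note that parser was removed from the standard library in Python 3.10, so it will show up as missing there regardless of PyRun.

```python
import importlib.util

# The modules named above as examples of optional shared modules.
OPTIONAL_MODULES = ("ctypes", "parser", "readline")


def available_modules(names=OPTIONAL_MODULES):
    """Map each module name to True if the running interpreter can find it."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in names
    }


if __name__ == "__main__":
    for name, found in sorted(available_modules().items()):
        print("%-10s %s" % (name, "available" if found else "missing"))
```

find_spec only locates the module without importing it, so the check itself does not load any of the shared libraries.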

The main purpose of eGenix PyRun is to work as run-time, so we optimized for this use. The optional shared modules can be added to the binary as well, if needed, by adding appropriate lines to the Setup.PyRun-X.X files used when building eGenix PyRun.

Anyway, give it a try and let me know what you think.

Enjoy,

Marc-André

