
start-stop-daemon: --exec vs --startas


start-stop-daemon is the classic tool on Debian and derived distributions for managing system background processes. A typical invocation from an initscript is as follows:

start-stop-daemon \
    --quiet \
    --oknodo \
    --start \
    --pidfile /var/run/daemon.pid \
    --exec /usr/sbin/daemon \
    -- -c /etc/daemon.cfg -p /var/run/daemon.pid

The basic operation is that it first checks whether /usr/sbin/daemon is already running and, if it is not, executes /usr/sbin/daemon -c /etc/daemon.cfg -p /var/run/daemon.pid. That process then has the responsibility to daemonise itself and write the resulting process ID to /var/run/daemon.pid.

start-stop-daemon then waits until /var/run/daemon.pid has been created as the test of whether the service has actually started, raising an error if that doesn't happen.

(In practice, the locations of all these files are parameterised to prevent DRY violations.)
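As an aside, the daemon's side of that contract is simple enough to sketch. Here is a minimal, hypothetical Python daemon that double-forks to detach itself and then writes its PID to the agreed pidfile; a real daemon would also reset its umask, chdir to /, redirect its standard streams and handle signals, but this is the part start-stop-daemon relies on:

#!/usr/bin/env python
# Minimal illustrative daemon: double-fork to detach, then record our PID so
# that start-stop-daemon can find us on subsequent start (and stop) calls.
import os
import sys
import time

PIDFILE = "/var/run/daemon.pid"   # matches the -p argument in the example

def daemonise():
    if os.fork() > 0:             # first fork: the parent returns to the shell
        sys.exit(0)
    os.setsid()                   # become session leader, drop the terminal
    if os.fork() > 0:             # second fork: never reacquire a terminal
        sys.exit(0)
    with open(PIDFILE, "w") as f:
        f.write("%d\n" % os.getpid())   # the PID start-stop-daemon will track

if __name__ == "__main__":
    daemonise()
    while True:                   # stand-in for the daemon's real work
        time.sleep(60)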

Idempotency

By idempotency we are mostly concerned that repeated calls to /etc/init.d/daemon start do not start multiple instances of our daemon.

This might not seem a particularly big issue at first, but the increasing adoption of stateless configuration management tools such as Ansible (which should be completely free to call start to ensure a started state) means that one should be particularly careful about this apparent corner case.

In its usual operation, start-stop-daemon ensures only one instance of the daemon is running with the --exec parameter: if the specified pidfile exists and the PID it refers to is an "instance" of that executable, then it is assumed that the daemon is already running and another copy is not started. This is handled in the pid_is_exec method (source) - the /proc/$PID/exe symlink is resolved and checked against the value of --exec.
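For the curious, the check amounts to something like the following rough Python approximation (the real implementation is C inside dpkg; this is only a sketch of the idea):

import os

def pid_is_exec(pid, execname):
    # Resolve /proc/<pid>/exe and compare it with the --exec value.
    try:
        exe = os.readlink("/proc/%d/exe" % pid)
    except OSError:
        return False                   # no such process, or not ours to inspect
    if exe.endswith(" (deleted)"):     # the kernel appends this if the binary
        exe = exe[:-len(" (deleted)")] # has been removed or upgraded on disk
    return exe == os.path.realpath(execname)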

Interpreted scripts

However, one case where this doesn't work is interpreted scripts. Let's look at what happens if /usr/sbin/daemon is such a script, e.g. a file that starts:

#!/usr/bin/env python
# [..]

The problem this introduces is that /proc/$PID/exe now points to the interpreter instead, often with an essentially non-deterministic version suffix:

$ ls -l /proc/14494/exe
lrwxrwxrwx 1 www-data www-data 0 Jul 25 15:18 /proc/14494/exe -> /usr/bin/python2.7

When this process is examined using the --exec mechanism outlined above it will be rejected as not being an instance of /usr/sbin/daemon, and therefore another instance of that daemon will be incorrectly started.

--startas

The solution is to use the --startas parameter instead. This omits the /proc/$PID/exe check and merely tests whether a PID with that number is running:

start-stop-daemon \
    --quiet \
    --oknodo \
    --start \
    --pidfile /var/run/daemon.pid \
    --startas /usr/sbin/daemon \
    -- -c /etc/daemon.cfg -p /var/run/daemon.pid

Whilst this is therefore less reliable (the PID found in the pidfile could belong to an entirely different process), it's probably an acceptable trade-off against the risk of running multiple instances of the daemon.

This danger can be ameliorated by using some of start-stop-daemon's other matching tests, such as --user or even --name.


Time for digital emancipation


Civilization is a draft. Provisional. Scaffolded. Under construction. For example:

[Image: Jefferson's rough draft of the Declaration of Independence]

That's Thomas Jefferson's rough draft of the Declaration of Independence. The Declaration hasn't changed since July 4, 1776, but the Constitution built on it has been amended twenty-seven times, so far. The thirteenth of those abolished slavery, at the close of the Civil War, seventy-seven years after the Constitution was ratified.

Today we are in another struggle for equality, this time on the Net. As Brian Grimmer put it to me, “Digital emancipation is the struggle of the century.”

There is an ironic distance between those first two words: digital and emancipation. The digital world by itself is free. Its boundaries are those of binary math: ones and zeroes. Connecting that world is a network designed to put no restrictions on personal (or any) power, while reducing nearly to zero the functional distance between everybody and everything. Costs too. Meanwhile, most of what we experience on the Net takes place on the World Wide Web, which is not the Net but a layer on top of it. The Web is built on an architectural framework called client-server. Within that framework, browsers are clients, and sites are servers. So the relationship looks like this:

[Image: calf-cow diagram]

In metaphorical terms, client-server is calf-cow. Bruce Schneier gives us another metaphor for this asymmetry:

It’s a feudal world out there.

Some of us have pledged our allegiance to Google: We have Gmail accounts, we use Google Calendar and Google Docs, and we have Android phones. Others have pledged allegiance to Apple: We have Macintosh laptops, iPhones, and iPads; and we let iCloud automatically synchronize and back up everything. Still others of us let Microsoft do it all. Or we buy our music and e-books from Amazon, which keeps records of what we own and allows downloading to a Kindle, computer, or phone. Some of us have pretty much abandoned e-mail altogether … for Facebook.

These vendors are becoming our feudal lords, and we are becoming their vassals.

It’s handy being a vassal. For example, you get to use these shortcuts into websites that require logins:

[Image: social sign-in buttons]

To see how much personal data you risk spilling when you click on the Facebook one, visit iSharedWhat (by Joe Andrieu) for a test run. That spilled data can be used in many ways, including surveillance. The Direct Marketing Association tells us the purpose of surveillance is to give you a better “internet experience” through “interest-based advertising—ads that are intended for you, based on what you do online.” The DMA also provides tools for you to manage experiences of what they call “your ads,” by clicking on this tiny image here:

[Image: AdChoices icon]

It appears in the corners of ads from companies in the DMA’s AdChoice program. Here is one:

[Image: Scottrade ad bearing the AdChoices icon]

The “AdChoices” text appears when you mouse over the icon. When I click on it, I get this:

[Image: AdChoices pop-down on the Scottrade ad]

Like most companies’ privacy policies, Scottrade’s says this: “Scottrade reserves the right to make changes to this Online Privacy Policy at any time.” But never mind that. Instead look at the links that follow. One of those leads to Opt Out From Behavioral Advertising By Participating Companies (BETA). There you can selectively opt out of advertising by dozens of companies. (There are hundreds of those, however. Most don’t allow opting out.)

I suppose that’s kind of them; but for you and me it’s a lot easier just to block all ads and tracking with a browser extension or add-on. This is why Adblock Plus tops Firefox’s browser add-ons list, which includes many other similar products as well. (The latest is Privacy Badger, from the EFF, which Don Marti visits here.)

Good as they are, ad and tracking blockers are still just prophylactics. They make captivity more bearable, but they don't emancipate us. For that we need first-person technologies: ways to engage as equals on the open Net, including the feudal Web.

One way to start is by agreeing about how we respect each other. The Respect Trust Framework, for example, is “designed to be self-reinforcing through use of a peer-to-peer reputation system.” Every person and company agreeing to the framework is a peer. Here are the five principles to which all members agree:

Promise We will respect each other’s digital boundaries

Every Member promises to respect the right of every other Member to control the Member Information they share within the network and the communications they receive within the network.

Permission We will negotiate with each other in good faith

As part of this promise, every Member agrees that all sharing of Member Information and sending of communications will be by permission, and to be honest and direct about the purpose(s) for which permission is sought.

Protection We will protect the identity and data entrusted to us

As part of this promise, every Member agrees to provide reasonable protection for the privacy and security of Member Information shared with that Member.

Portability We will support other Members’ freedom of movement

As part of this promise, every Member agrees that if it hosts Member Information on behalf of another Member, the right to possess, access, control, and share the hosted information, including the right to move it to another host, belongs to the hosted Member.

Proof We will reasonably cooperate for the good of all Members

As part of this promise, every Member agrees to share the reputation metadata necessary for the health of the network, including feedback about compliance with this trust framework, and to not engage in any practices intended to game or subvert the reputation system.

The Respect Network has gathered several dozen founding partners in a common effort to leverage the Respect Trust Framework into common use. I’m involved with two of those partners: The Searls Group (my own consultancy, for which Respect Network is a client) and Customer Commons (in which I am a board member).

This summer Respect Network launched a crowd-funding campaign for this social login button:

[Image: Respect Connect button]

It's called the Respect Connect button, and it embodies all the principles above, but especially the first one: We will respect each other's digital boundaries. This makes it the first safe social login button.

Think of the Respect Connect button project as a barn raising. There are lots of planks (and skills) you can bring, but the main ones will be your =names (“equals names”). These are sovereign identifiers you own and manage for yourself — unlike, say, your Twitter @ handle, which Twitter owns. (Organizations — companies, associations, governments — have +names and things have *names.)

Mine is =Doc.

Selling =names are CSPs: Cloud Service Providers. There are five so far (based, respectively, in Las Vegas, Vienna, London, New York/Jerusalem and Perth):

[Logos: bosonweb, danube_clouds, paoga, emmett_global, onexus]

Here's a key feature: they are substitutable. You can port your =name from one to the other as easily as you port your phone number from one company to another. (In fact the company that does this in the background for both your =name and your phone number is Neustar, another Respect Network partner.)

You can also self-host your own personal cloud.

I just got back from a world tour of places where much scaffolding work is taking place around this and many other ways customers and companies can respect each other and grow markets. I’ll be reporting more on that in coming posts.

 


Docker security with SELinux (Opensource.com)

Dan Walsh looks at container security, on Opensource.com. "I hear and read about a lot of people assuming that Docker containers actually sandbox applications—meaning they can run random applications on their system as root with Docker. They believe Docker containers will actually protect their host system [...] Stop assuming that Docker and the Linux kernel protect you from malware."

What are useful online tools for Linux


As you know, GNU/Linux is much more than just an OS. There is literally a whole sphere on the Internet dedicated to the penguin OS. If you read this post, you are probably inclined towards reading about Linux online. Among all the pages that you can find on the subject, there are a couple […]
Continue reading...

The post What are useful online tools for Linux appeared first on Xmodulo.


green processes: multiple process architectures for everyone

I've been rambling-writing for a couple weeks now about multi-process architecture (MPA) as seen in applications such as Akonadi or Chrome. After covering both the up- and down-sides, I also wrote a bit about missed opportunities: things that should be possible with MPA but which are rarely seen in practice. I kept promising to write about my thoughts on solutions to the downsides and missed opportunities, and that's where we are now.

The answer is deceptively simple: green processes. These are like green threads, which a few people actually mentioned in the comments sections of those past blog entries. Like green threads, green processes are mimics of native processes but run in a virtual machine (VM) instead. Some of the benefits are similar: you can have far less overhead per process (Erlang's per-process overhead is just 309 words) and you can launch thousands of them, even with only one CPU core to work with, without grinding the system to a halt. More usefully, if you have a 4-core system the VM can schedule N processes onto exactly 4 native threads, making full use of the system while minimizing things like context-switch costs. (That does trivialize the complexity of the task by summarizing it in a single sentence ...)

Unlike green threads, however, green processes provide separation of memory access and require message passing rather than shared memory objects. They also more naturally follow the failure semantics of processes, such as the ability to crash. That may seem like a trivial set of differences, but the implications are hugely significant.

For instance, with crashing processes you can limit (and in many cases simply get rid of entirely) defensive programming. This requires a suitable language to go with the VM, but with such a thing in hand code can be written that only works in known-good (or white-listed) states. Everything else causes a crash, and this gets propagated appropriately through the system.

That system can include process supervisors which can restart processes or kill related ones in response to failures. It should also be noted that one still wants exceptions as those are for a rather different use case: non-fatal errors with known causes. (Essentially white-listed undesired behaviors.)
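To make that concrete, here is a rough sketch of the supervise-and-restart pattern using ordinary OS processes via Python's multiprocessing. A green-process VM would give you the same shape far more cheaply and with proper supervision trees; the names and restart policy here are purely illustrative:

import multiprocessing as mp

def worker(inbox):
    # Crash-only worker: handle what we expect, let anything else raise.
    for msg in iter(inbox.get, None):     # None is the shutdown sentinel
        print("handled", msg)

def supervise(target, inbox, max_restarts=5):
    for attempt in range(max_restarts):
        proc = mp.Process(target=target, args=(inbox,))
        proc.start()
        proc.join()                       # returns when the child exits
        if proc.exitcode == 0:
            return                        # clean shutdown, nothing to do
        print("worker died (exit %s), restarting" % proc.exitcode)
    raise RuntimeError("giving up after %d restarts" % max_restarts)

if __name__ == "__main__":
    queue = mp.Queue()
    queue.put({"job": 1})
    queue.put(None)                       # ask the worker to stop cleanly
    supervise(worker, queue)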

With message passing being the only way to coordinate between processes, you also are a tiny step away from distributed applications. After all, it doesn't look any different to the process in the VM if the message came from a process in the same VM or from a process elsewhere. So you can now split up your application between multiple VMs on the same system or on completely different systems connected by a network.

This could even be done at runtime or as a configuration option. Consider a system where one process needs to use unsafe native code (more on that later) and the rest simply do their thing safely in the VM. That one process could be moved to a separate VM (complete with process supervision) to keep it away from the rest of the processes. If the native code takes down that process or leads to a security vulnerability, you now have process separation at the native OS level. If that process only causes problems on some systems (say, a graphical app being run on systems with a particular vendor's particularly crashtastic GPU drivers), then it could be moved out of process on only those systems and kept in-process otherwise.

Due to the lightweight nature of green processes one can afford to break an application up into a massively concurrent system. Each bit of I/O can live in its own process, for instance. Spawning a process for each socket or file operation becomes reasonable. Synchronous finite state machines (FSMs), which are very useful and, when not forced into asynchronous behavior by event loops, very simple to write and manage, become second nature.
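As a tiny illustration of that last point, a process per conversation lets the state machine be written as straight-line, blocking code instead of event-loop callbacks. A hypothetical sketch (again in Python, with a queue standing in for the process mailbox):

def greeter(inbox):
    # Synchronous FSM: block on the next message, keep state in a local variable.
    state = "awaiting_hello"
    while True:
        msg = inbox.get()                 # blocking receive
        if state == "awaiting_hello" and msg == "hello":
            state = "awaiting_name"
        elif state == "awaiting_name":
            print("hello,", msg)
            return                        # finished: the process simply exits
        else:
            raise ValueError(msg)         # unexpected input: crash, and let a
                                          # supervisor decide what happens next

Driving it is just inbox.put("hello") followed by inbox.put("Anna") from whoever holds the other end of the mailbox.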

A VM with green processes ends up resolving, or at least noticeably improving on, nearly all the downsides of native processes while also making most of the missed opportunities low-hanging fruit, sometimes even for free (as seen with distributing processes). What impact could this have?

Let's look at Akonadi. It could be a hundred processes instead of the two dozen it currently is on my system (I have a lot of accounts ...), making the code much simpler in the process (no more async job and async FSM implementations) while having dramatically lower overhead. Not only would the per-process cost be lower, even with message passing the cost would be effectively zero as objects (not just serialized memory buffers!) are passed using copy-on-write (COW) and/or reference-counted storage. On smaller systems with only one or two cores, the entire application would automatically "collapse" down to a single-threaded application that would be nearly identical in nature (and therefore also in performance) to a purposefully single-threaded, single-process, async event-driven application. This is a new avenue for scalability. Another one would be how remote Akonadi processes would come along for free; want to run Akonadi on a server but access the data locally? Want to spread your Akonadi processes across a network of systems? No problem. Meanwhile, all the benefits of the multi-process system it currently enjoys, including robustness, would be retained.

Or how about Plasma? Every DataEngine, every model and every widget could live in its own process in the VM. Since it looks like one process to the native OS (because it is), the actual UI could be delegated to a specific thread, and even on x.org it could paint all day even if a DataEngine or other component were to die. A newly installed Plasmoid, even one with compiled code, could be hot-reloaded without restarting the desktop shell. KRunner could be sucked into the main plasma-shell process without risking stability or robustness, shrinking the overall footprint of the system. In fact, all of the separate processes could be pulled into the same VM while retaining all of the current benefits, including re-use on other systems that don't want plasma-shell.

It isn't all picnic baskets and fluffy white clouds, however.

Running natively compiled C/C++/ASM code in your VM will crash the entire VM when that code segfaults. So if one were to use Qt, for example, from such a VM, any crash in Qt would bring down the whole thing. The benefits are only offered to code running in the VM. Perhaps using multiple VMs for just those parts of the system would be a solution in some cases. Given the message passing of such a system, it is even possible to run native code applications external to the VM and have them appear inside the VM as any other local-to-the-VM process (to other processes in the VM, anyway) by grafting the message passing protocol onto that native application.

C/C++ isn't particularly suited to being run in such a VM, at least not with today's tools. So either new languages are needed or we need new tools. Or both. A new language for such a VM would not be the worst of ideas, actually: look at how QML has vastly improved life in the UI world. Similarly, a properly designed language could make application development much safer, much easier and much faster. (To those who live and breathe Python/Ruby/Perl/etc. that's not news, I'm sure.) That does mean that some amount of application code would need to be rewritten to take advantage of such a VM. This is probably desirable in any case since most applications will need rethinking in terms of massive concurrency. If the VM supports easily binding in existing native code, this could even be done in stages. We know that can work as this is exactly how Plasma was transitioned from imperative QGraphicsView code to declarative QML code. Still, there would be a transition period.

Finally, such a VM is not trivial to do correctly. It ends up becoming a micro-operating-system in its own right with memory management and process schedulers. Care needs to be paid to how messages are passed to keep overhead low there, as well. Making such a thing would be a significant investment. One could almost certainly start with existing art, but a lot of work would still be required.

Speaking of existing art: as I was considering the "after convergence" questions earlier this year, I ended up looking around for systems already tackling the problems I saw related to MPA and I came across Erlang which does nearly all of the above. So if you are interested in seeing what this can look like in practice, I definitely recommend looking into it. It is not a perfect system, but it is about the only one I could find that wasn't an academic project that addressed these issues head-on. I think the original designers of the runtime were way ahead of their time.

Erlang itself is probably not 'the answer', however. It is an amazing system that is just fantastic for server side work. (On that note, I think it is a much better fit for Akonadi than C++ ever could be, or for Bodega than node.js ever could be.) However, it lacks some of the features I'd really like to see in such a system such as cross-VM process migration (non-trivial to implement properly, granted), it is not well suited to numeric processing (not an issue for most data-centric servers) and has some legacy issues related to things like strings (though that has come leaps and bounds in recent years as well). I don't care about syntax warts as they aren't that bad in Erlang and if you think they are then simply use the rather nicer Elixir that runs on the same VM.

However, a system designed specifically for the needs of client-side user interface code could be marvelous. Systems under the application would remain in C/C++ and the UI should be done declaratively (with e.g. QML), but the application could be something else entirely that makes writing the applications in between those two parts far easier and ...

... more importantly than easier development, gift them with super-powers current systems are unlikely to ever offer. Multi-process architecture, despite the inherent benefits like increased application robustness, is really just a stepping stone towards being able to do more interesting things.

With the premise of a massively-concurrent system with minimized costs and a consistent message passing system in place as axiomatic assumptions for future applications, we can begin to explore new possibilities that open up.

You see, after convergence, which allows us to forget that devices are somehow different from each other (without hobbling the user interface into stupidity in the process), we can start thinking about how to use all those devices as a single fabric, be they remote servers, you and your friend's phones, your phone and your boss' laptop .. be they located on the same or different network segments .. be they in a consistent or a dynamic set of states ..

Still, before plunging in that direction conceptually, there are useful insights remaining to be brought into this house of thought. Next I will be exploring methods of authentication.

Until then, consider this question: How many separate online accounts do you have?

(I bet almost everyone will get the answer to that one "wrong". *devilish smile*)
jepler (11 days ago):
green processes = in-process sandboxing, and that's the model we decided *didn't* work.

opportunities presented by multi-process architectures

So after stating that I was thinking about what comes "after convergence" I rambled on a bit about social networks and the cloud before veering wildly towards applications written as sets of processes, or multi-process architectures (MPA). I wrote about the common benefits of MPA and then the common downsides of them. Before trying to pull this full-circle back to the original starting point all those blog entries ago, I need to say a few things about opportunities presented by MPA systems that are rarely taken advantage of and why that is.

I don't want to give the impression that I think I've come up with the ideas in this blog entry. Most already exist in production somewhere in the world out there. There are certainly great academic papers to be found on each of the topics below. It just seems to be a set of topics that is not commonly known, thought about or discussed, particularly outside the server room. So nothing here is new, though I suspect at least some of the ideas might be new to many of the readers.

Before getting into the topics at hand, I'd also like to give a nod to the Erlang community. When I started toying about with the questions that inspired this line of thought, I went looking for solutions that already existed. There almost always is someone, somewhere working on a particular way to approach a given problem. This is the beauty of a global Internet with so many people trained to work with computers. While looking around at various things, Erlang caught my eye as it is designed for radical MPA applications. While messing around with Erlang I began to understand just how desperately little most applications were getting out of the MPA pattern, and it helped illuminate new possible paths of exploration to tough challenges we face such as actually getting the most out of "convergence" when the reality is people have multiple devices (if only because each person has their own device). It's pretty amazing what many of the people in that community are up to .. for instance:

[Embedded video]


Hot Upgrades

May as well start with this one since I posted that video above; that way I can pretend it was a clever segue instead of just nerd porn. ;) Typically when we think of upgrading an application, we think of upgrading all of it. We also generally expect to have to restart the application to get the benefits of those upgrades. Our package managers on Linux often tell us that explicitly, in fact.

There are exceptions to this. The most obvious example is plugins: you can change a plugin on disk and, at least on next start, you get new functionality without touching the rest of the installed application components. When the plugins are scripted, it becomes possible to do this at runtime. 

In fact, sometime between Plasma version 4.2 and 4.4 I worked up a patch that reloaded widgets written in Javascript when they changed on disk. With that patch, the desktop could just continue on and pieces could be upgraded here or there. I never merged it into the mainline code as I wasn't convinced it would be a commonly useful thing and there was some overhead to it all.

Generally, however, applications are upgraded monolithically. This is in large part because componentization, MPA or not, is not as common as it ought to be. It's also in part because components rely on interfaces and too many developers lack the discipline to keep those interfaces consistent over time. (The web is the worst for this, but every time I see Firefox checking plugins after a minor version upgrade I shake my head.) It also doesn't fit very nicely into the software delivery systems we typically use, all of which tend to be built around the application level in granularity.

Those are all solvable issues. What is less obvious is how to handle code reloading at runtime when the system is in active use. When upgrading a single component while the application is running, how does one separate it safely from the rest of the running code to perform an in-place update of the code to be executed?

With an MPA approach the answer seems fairly straight-forward: bring up a new process, switch to using the new process, kill the old one. Since the processes are already separated from each other, upgrading one component on the fly ought to be possible without fiddling with the rest of the application. It's less straight-forward in practice, however, as one needs to handle any tasks the old process is still processing, deal with any pending messages sitting in the old process' queue (or whatever is being used for message passing) and hand over all messaging connections to the new process. It is possible, though, as the video above demonstrates, but even in Erlang, where this was designed into the system, one needs to approach it with a measure of thoughtfulness.
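A toy version of that hand-over, once more with plain OS processes standing in for green ones, might look like the following; real systems also have to migrate connections and in-flight state, which this deliberately glosses over:

import multiprocessing as mp

def worker_v1(inbox):
    for msg in iter(inbox.get, None):
        print("v1 handled", msg)

def worker_v2(inbox):
    for msg in iter(inbox.get, None):
        print("v2 handled", msg)

if __name__ == "__main__":
    old_inbox, new_inbox = mp.Queue(), mp.Queue()
    old = mp.Process(target=worker_v1, args=(old_inbox,))
    old.start()
    old_inbox.put("request 1")     # traffic arriving before the upgrade

    new = mp.Process(target=worker_v2, args=(new_inbox,))
    new.start()                    # 1. bring up the new process
    current_inbox = new_inbox      # 2. point producers at the new inbox
    current_inbox.put("request 2")
    old_inbox.put(None)            # 3. let the old process drain and exit
    old.join()
    new_inbox.put(None)
    new.join()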

This is an uncommon feature because it is not easy to get right and pretty well all the existing systems out there lack any support for it whatsoever. However, if we want applications that run "forever", then this needs to become part of our toolbox. Given that the Linux kernel people have been working on addressing this issue (granted, for them restarting the "application" means a device restart which is even more drastic than restarting a single application), our applications ought to catch up.

Process hibernation

Most MPA applications spin up processes and let them sit there until the process is finished being useful. For things like web and file servers using this design, the definition of "useful" is pretty obvious: when the client connection drops and/or the client request is done being handled, tear down the process. 

Even then there is a fly in the ointment: spinning up a process can take more time than one wants and so perhaps it makes sense to just have a bunch of processes sitting there laying in wait to process incoming requests one after the other. Having just started apache2 on my laptop to check that it does what I remember it doing, indeed I have a half-dozen httpd-prefork processes sitting around doing nothing but wasting resources. The documentation says: "Apache always tries to maintain several spare or idle server processes, which stand ready to serve incoming requests. In this way, clients do not need to wait for a new child processes to be forked before their requests can be served."

Keeping processes around can also make it a lot easier when creating an MPA structure: spin up everything you need and let it sit around. This is a great way to avoid complex "partial ready" states in the application.

With multiple processes, one would hope it would be possible to simply hibernate a specific process on demand. That way it exists, for all intents and purposes, but isn't using resources. This keeps the application state simple and can avoid at least some of the process spin-up costs. Unfortunately, this is really, really hard to do with 100% fidelity. People have tried to add this kind of feature to the Linux kernel, but it has never made it in. One of the biggest challenges is handling files and sockets sensibly during hibernation.

This article on deploying node.js applications with systemd is interesting in its own rights, but also shows that people really want this kind of functionality. Of course, the article is just showing a way to stop and start a process based on request activity which isn't quite the same thing at all.

Surprise: Erlang actually has this covered. It isn't a magical silver bullet and you don't want to use it with processes that are under constant usage (it has overhead; this kind of feature necessarily always will), but it works reliably and indeed can help with resource usage.

Every time I look at all those processes in Akonadi just sitting there when they know they won't be doing a thing until the next sync or mail check I despair just a little. Every time I notice that Chrome has a process running for those tabs showing static content, chewing up 15-30 MB of memory each, I wish process hibernation was common.

Radical MPA

If you look through the code of Akonadi resources, such as the imap resource in kdepim-runtime/resources/imap, one will quickly notice that they are really rather big monolithic applications. The imap resource spins up numerous finite state machines (though they often aren't implemented as "true" FSMs, but as procedural code that is setting and checking state-value-bearing variables everywhere) and handles large number of asynchronous jobs within its monolith. As a result it is just over ten thousand lines of code, and that isn't even counting the imap or the Akonadi/KDE generic resources libraries it uses. That's a large chunk of code doing a lot of things.

That's kind of odd. Akonadi is an MPA application, but its components are traditional single-process job multiplexers. This is through no fault or shortcoming of Akonadi or its developers. In fact, I feel kind of bad for referencing Akonadi so often in these entries because it is actually an extremely well done piece of software that has matured nicely by this point. It's just one of the few MPA applications for the desktop in wide use where these kinds of warts can't be blamed on anything other than the underlying language and frameworks ... it's because Akonadi is good that I keep bringing it up as an example, as it highlights the limits of what is possible with the way we do things right now. Ok, enough "mea culpa" to the Akonadi team ... ;)

It would be very cool if the complex imap resource was itself an MPA system. State handling would dramatically simplify and due to this I am pretty sure a significant percentage of those 10,000+ lines of code would just vanish. (I took another drift through the code base of the imap resource this morning while the little one had a nap to check on these things. :)

Additionally, if it were "radically" MPA then a lot of the defensive programming could simply be dropped. Defensive programming is what most of us are quite used to: call a function, check for all the error states we can think of, respond accordingly, and only then handle the "good" cases. This is problematic as humans are pretty bad at thinking of all possible error states when systems move beyond "trivial" in complexity. With radically MPA systems, however, each job can be handled by a separate process and, should something other than success happen, it can simply halt. No state needs to be preserved or reset; at most the other processes that are waiting for news back on how the job went may want to be informed that something went sideways so they can also either halt or continue on. (Yes, logging, too.) This not only makes the code much easier for humans to write, as one only needs to write for what is expected rather than try to list all the things that are unexpected, but makes the code base radically smaller as most of the "if (failureCondition)" branches that pepper our code today simply melt away.

This doesn't happen because processes are expensive, frameworks that can handle such radical MPA systems are hard to write, and few pre-made ones exist.

Supervision

With any MPA system, but particularly radically MPA applications, the opportunity for process supervision arises. Supervision is when one process watches other processes and decides their fate: when to start them, when to stop them, when (and how!) to restart them on failure.

I first became interested in the possibilities for system robustness due to supervision when systems such as systemd started coming together. systemd, however, falls wildly short of the possibilities. For what is perhaps the definitive example of supervision one ought to look to Erlang.

In Erlang applications, which are encouraged to be MPA, one defines a supervision tree. Not only can you have individual processes supervised for failure, but you can have nested trees of supervisors, each with a policy to follow when processes are created and when they fail. You can, for instance, tell the supervisor that a particular group of processes all rely on each other and so, should one fail, they should all fail and be restarted. You can define timeouts for responsiveness, how many times to restart processes, and such things.

This allows one to define the state of the full set of processes in a robust manner, from process execution through to finality. This is a key to robust, long-lived applications.

Processing non-locality

With the MPA model it is trivial to spread a task out across multiple machines. This is extremely common in the server room when it comes to large applications and as such is quite well understood. There are large numbers of libraries and frameworks that make this easier, from message queueing systems to service discovery mechanisms. The desirability of this is quite obvious: finishing large tasks quickly is often beyond the reach of individual machines. So put a bunch of them together and figure out how to make them work together on problems. Voila, problem solved. (Writing blogs is fun: you can make the amazingly difficult and complex appear in a puff of magic smoke with a simple "voila" .. ;)

Outside the server room this is hardly ever used, however. MPA systems aimed at non-server workloads tend to assume they all run on the same system. Well, we live in a different world than we did twenty years ago. 

People have multiple rather powerful computers with them. Right now I have my laptop, a tablet, two ARM devices and a smartphone. The printer sitting next to me isn't very powerful, but it runs a full modern OS as well.

We also routinely use services that exist on other machines, but instead of letting processes coordinate as we would if they were local we tend to blast data around in bulk or create complex protocols that allow the local machine to dictate to the remote machine what we'd like it to do for us. The protocols tend to result in multiple layers of indirection on the remote side: JSON over HTTP hits a web server which forwards the request to a bit of scripted code that interacts with the server to construct some sort of response which it then encodes into JSON and sends back over HTTP to be reconstituted on the local side ... and should the client ever want something ever so slightly different, too bad.

This makes using two end-user devices together really rather difficult. The byzantine client-server architecture ensures that it is non-trivial to do simple things and that significant pieces of software must be run on any devices one wishes to coordinate. People are aware of the annoyances: Apple has Handoff, KDE has KDE Connect, .. but what happens if I'm running Akonadi (that guy again!) on my Plasma Desktop system and I'd like it to be able to find results from data on my Plasma Active tablet? Yeah, not simple to solve .. unless Akonadi could run agents remotely as transparently as it can locally. Which would mean more complexity in the Akonadi code base, and every other MPA app that might benefit from this. As it is additional functionality and probably nobody has yet hit this use case, Akonadi is not capable of it and the use case is likely to never be fulfilled once people run into it.
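For contrast with the JSON-over-HTTP indirection described above, here is a small sketch of what "a remote process looks like a local one" can feel like even with today's tools, using Python's multiprocessing.connection; the address and key are invented, and a green-process VM would hide even this much plumbing:

from multiprocessing.connection import Listener, Client

ADDRESS = ("0.0.0.0", 7777)        # hypothetical host and port
AUTHKEY = b"not-a-real-secret"

def serve_one_request():
    # Accept a single connection and answer it with plain Python objects;
    # no web server, routing layer or JSON (re)serialization in sight.
    with Listener(ADDRESS, authkey=AUTHKEY) as listener:
        with listener.accept() as conn:
            request = conn.recv()
            conn.send({"echo": request})

def ask(host, payload):
    # The caller's code is identical whether host is "localhost" or a
    # machine on the other side of the network.
    with Client((host, 7777), authkey=AUTHKEY) as conn:
        conn.send(payload)
        return conn.recv()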

It is the combination of thinking with blinders on ("JSON over HTTP.. merp merp!") and the difficulty level of "remote processes are the same as local processes" that prevents this from happening in places we could benefit from.

Process migration

This goes hand-in-hand with the remote process possibility, but takes it up a notch. In addition to having remote processes, how amazing would it be to be able to migrate processes between machines? This also has been done; it's a pretty common pattern in big multi-agent systems from what I understand. This carries a lot of the same requirements and challenges of process hibernation, and probably would only be possible with specially designated processes. Security is another concern, but not an insurmountable obstacle.

Along with lack of remote processes, this is probably the sole reason it is so hard to transfer the picture from your phone to your laptop: byzantine structures must exist on all systems to do the simplest of things, and every time we wish to do a new simple thing that byzantine structure needs an upgrade, typically on both sides. How awful. This is probably also my cue to moan about Bluetooth, but this blog entry is already long enough.

Security

Many MPA applications already get a security leg up by having their processes separated by the operating system. The processes can't poke around at each other's memory locations, a crash doesn't take down the whole system, and so on. This is really just a few tips of what is a monumental iceberg, however.

It ought to be possible to put each process in its own container with its own runtime restrictions. The Linux middleware people working on things like kdbus and systemd along with the plethora of containerization projects out there are racing to bring this to individual applications. MPA applications ought to take advantage of these things to keep attack vectors down while allowing each process in the swarm access to all the functionality it needs. 

Combined with radical MPA, this could be a significant boost: imagine all external input (e.g. from the network) being processed into safe, internal data structures in a process that runs in its own security context. Should something nasty happen, like a crash, it has access to nothing useful.
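A very rough sketch of that idea with today's tools (and nothing like a real hardening recipe): push the parsing of untrusted input into a short-lived child process with tight resource limits, so the worst a hostile input can do is take down that one process. The limits, the JSON parser and the names here are all illustrative, and the resource module makes this Linux/Unix-only:

import json
import multiprocessing as mp
import resource

def parse_untrusted(raw, outbox):
    # Clamp address space and CPU time before touching the input at all.
    resource.setrlimit(resource.RLIMIT_AS, (512 * 1024 * 1024,) * 2)
    resource.setrlimit(resource.RLIMIT_CPU, (5, 5))
    outbox.put(json.loads(raw))    # a crash here only kills this process

def safe_parse(raw, timeout=10):
    outbox = mp.Queue()
    proc = mp.Process(target=parse_untrusted, args=(raw, outbox))
    proc.start()
    proc.join(timeout)
    if proc.is_alive():            # hung or spinning: kill it
        proc.terminate()
        proc.join()
    if proc.exitcode != 0 or outbox.empty():
        raise ValueError("untrusted input rejected")
    return outbox.get()

if __name__ == "__main__":
    print(safe_parse('{"ok": true}'))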

OpenSSH already went in this direction many years ago with privilege separation, but with modern containerization we could take this to a whole new level that would be of interest to many more applications.

Again, it means having the right frameworks available to make this easy to do.

... in conclusion

So we've now seen the common benefits of MPA, the common downsides of MPA and finally some more exotic possibilities which are rarely taken advantage of but would be very beneficial as common patterns.

Next we will look at how to bring this all together and attempt to describe a more perfect system which avoids the negatives and offers as many of the positives as possible.

The ultimate goal is completely robust applications with improved security and flexibility that can work more naturally (from a human's perspective) in multi-device environments and, perhaps, allow us to start tackling the more thorny issues of privacy and social computing.