Thomas Guest: Aligning the first line of a triple-quoted string in Python

Python’s triple-quoted strings are a convenient syntax for strings where the contents span multiple lines. Unescaped newlines are allowed in triple-quoted strings. So, rather than write:

song = ("Happy birthday to you\n"
        "Happy birthday to you\n"
        "Happy birthday dear Gail\n"
        "Happy birthday to you\n")

you can write:

song = """Happy birthday to you
Happy birthday to you
Happy birthday dear Gail
Happy birthday to you
"""

The only downside here is that the first line doesn’t align nicely with the lines which follow. The way around this is to embed a \newline escape sequence, meaning both backslash and newline are ignored.

song = """\
Happy birthday to you
Happy birthday to you
Happy birthday dear Gail
Happy birthday to you
"""
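As a quick check, the three spellings discussed above should all produce the same string. The variable names below are only for illustration.

```python
# Three spellings of the same four-line string.
concatenated = ("Happy birthday to you\n"
                "Happy birthday to you\n"
                "Happy birthday dear Gail\n"
                "Happy birthday to you\n")

plain = """Happy birthday to you
Happy birthday to you
Happy birthday dear Gail
Happy birthday to you
"""

aligned = """\
Happy birthday to you
Happy birthday to you
Happy birthday dear Gail
Happy birthday to you
"""

# The backslash-newline pair contributes nothing to the value.
assert concatenated == plain == aligned
print(aligned.count("\n"))  # 4
```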

Planet Python

Python Sweetness: Threadless mode in Mitogen 0.3

Mitogen has been explicitly multi-threaded since the design was first conceived. This choice is hard to regret, as it aligns well with the needs of operating systems like Windows, makes background tasks like proxying possible, and allows painless integration with existing programs where the user doesn’t have to care how communication is implemented. Easy blocking APIs simply work as documented from any context, and magical timeouts, file transfers and routing happen in the background without effort.

The story has for the most part played out well, but as work on the Ansible extension revealed, this thread-centric worldview is more than somewhat idealized, and scenarios exist where background threads are not only problematic, but a serious hazard that works against us.

For that reason a new operating mode will hopefully soon be included, one where relatively minor structural restrictions are traded for no background thread at all. This article documents the reasoning behind threadless mode, and a strange set of circumstances that allow such a major feature to be supported with the same blocking API as exists today, and surprisingly minimal disruption to existing code.

Recap

Above is a rough view of Mitogen’s process model, revealing a desirable symmetry as it currently exists. In the master program and replicated children, the user’s code maintains full control of the main thread, with library communication requirements handled by a background thread using an identical implementation in every process.

Keeping the user in control of the main thread is important, as it possesses certain magical privileges. In Python it is the only thread from which signal handlers can be installed or executed, and on Linux some niche system interfaces require its participation.

When a method like remote_host.call(myfunc) is invoked, an outgoing message is constructed and enqueued with the Broker thread, and a callback handler is installed to cause any return value response message to be posted to another queue created especially to receive it. Meanwhile the thread that invoked Context.call(..) sleeps waiting for a message on the call’s dedicated reply queue.
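The flow described above can be sketched with the standard library alone. This is a toy model, not Mitogen's implementation: ToyBroker and call() are hypothetical names, the "remote" function simply runs on the broker thread, and a plain queue.Queue stands in for the reply queue.

```python
import queue
import threading

class ToyBroker(threading.Thread):
    """Stand-in for the broker thread: takes outgoing messages, delivers replies."""

    def __init__(self):
        super().__init__(daemon=True)
        self.outgoing = queue.Queue()

    def run(self):
        while True:
            item = self.outgoing.get()
            if item is None:          # shutdown marker
                break
            func, args, reply_queue = item
            # A real broker would write to a socket and later decode a reply;
            # here the "remote" call just runs locally.
            reply_queue.put(func(*args))

def call(broker, func, *args):
    reply_queue = queue.Queue()       # dedicated to this one call
    broker.outgoing.put((func, args, reply_queue))
    return reply_queue.get(timeout=5) # the caller sleeps until the reply arrives

broker = ToyBroker()
broker.start()
result = call(broker, pow, 2, 10)
print(result)  # 1024
broker.outgoing.put(None)
```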

Latches

Those queues aren’t simply Queue.Queue, but a custom reimplementation added early during Ansible extension development, as deficiencies in Python 2.x threading began to manifest. Python 2 permits the choice between up to 50 ms latency added to each Queue.get(), or for waits to execute with UNIX signals masked, thus preventing CTRL+C from interrupting the program. Given these options a reimplementation made plentiful sense.

The custom queue is called Latch, a name chosen simply because it was short and vaguely fitted. To say its existence is a great discomfort would be an understatement: reimplementing synchronization was never desired, even if just by leveraging OS facilities. True to tribal wisdom, the folly of Latch has been a vast time sink, costing many days hunting races and subtle misbehaviours, yet without it, good performance and usability are not possible on Python 2, and so it remains for now.

Due to this, when any thread blocks waiting for a result from a remote process, it always does so within Latch, a detail that will soon become important.

The Broker

Threading requirements are mostly due to Broker, a thread that has often changed role over time. Today its main function is to run an I/O multiplexer, like Twisted or asyncio. Except for some local file IO in master processes, broker thread code is asynchronous, regardless of whether it is communicating with a remote machine via an SSH subprocess or a local thread via a Latch.

When a user’s thread is blocked on a reply queue, that thread isn’t really blocked on a remote process – it is waiting for the broker thread to receive and decode any reply, then post it to the queue (or Latch) the thread is sleeping on.

Performance

Having a dedicated IO thread in a multi-threaded environment simplifies reasoning about communication, as events like unexpected disconnection always occur in a consistent location far from user code. But as is evident, it means every IO requires interaction of two threads in the local process, and when that communication is with a remote Mitogen process, a further two in the remote process.

It may come as no surprise that poor interaction with the OS scheduler often manifests. Load balancing pushes related communicating threads out across distinct cores, where their execution schedule bears no resemblance to the inherent lock-step communication pattern caused by the request-reply structure of RPCs, and threads of the same process additionally contend for the Global Interpreter Lock. The range of undesirable effects defies simple description; it is sufficient to say that poor behaviour here can be disastrous.

To cope with this, the Ansible extension introduced CPU pinning. This feature locks related threads to one core, so that as a user thread enters a wait on the broker after sending it a message, the broker has much higher chance of being scheduled expediently, and for its use of shared resources (like the GIL) to be uncontended and exist in the cache of the CPU it runs on.
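For illustration, pinning on Linux can be done with os.sched_setaffinity(). The sketch below is not the Ansible extension's code: pin_to_one_cpu() is a hypothetical helper, and it assumes that choosing the first allowed CPU is acceptable. On Linux, threads and child processes created afterwards inherit the mask.

```python
import os

def pin_to_one_cpu(cpu=None):
    """Restrict the caller to a single CPU; return it, or None if unsupported."""
    if not hasattr(os, "sched_setaffinity"):
        return None  # e.g. macOS or Windows
    allowed = sorted(os.sched_getaffinity(0))
    target = allowed[0] if cpu is None else cpu
    os.sched_setaffinity(0, {target})
    return target

pinned = pin_to_one_cpu()
if pinned is not None:
    # The affinity mask now contains exactly one CPU.
    print(os.sched_getaffinity(0) == {pinned})
```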

Runs of tests/bench/roundtrip.py with and without pinning.
Pinned?   Round-trip delay   Average
No        960 usec           848 usec ± 111 usec
No        782 usec
No        803 usec
Yes       198 usec           197 usec ± 1 usec
Yes       197 usec
Yes       197 usec

It is hard to overstate the value of pinning, as revealed by the more than fourfold reduction in average round-trip delay visible in this stress test (848 usec down to 197 usec), but enabling it is a double-edged sword, as the scheduler loses the freedom to migrate processes to balance load, and no general pinning strategy is possible that does not approach the complexity of an entirely new scheduler. As a simple example, if two uncooperative processes (such as Ansible and, say, a database server) were to pin their busiest workers to the same CPU, both would suffer disastrous contention for resources that a scheduler could alleviate if it were permitted.
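The averages follow directly from the six measurements listed for tests/bench/roundtrip.py. A short calculation, using only those figures, recovers them along with the ratio between the two configurations:

```python
from statistics import mean

# Round-trip delays in microseconds, copied from the benchmark runs.
unpinned_usec = [960, 782, 803]
pinned_usec = [198, 197, 197]

unpinned_avg = mean(unpinned_usec)
pinned_avg = mean(pinned_usec)
ratio = unpinned_avg / pinned_avg           # how many times faster pinned is
reduction = 1 - pinned_avg / unpinned_avg   # fraction of delay removed

print(round(unpinned_avg), round(pinned_avg))   # 848 197
print(round(ratio, 1), round(reduction * 100))  # 4.3 77
```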

While performance loss due to scheduling could be considered a scheduler bug, it could be argued that expecting consistently low latency lock-step communication between arbitrary threads is unreasonable, and so it is desirable that threading rather than scheduling be considered at fault, especially as one and not the other is within our control.

The desire is not to remove threading entirely, but instead to provide an option to disable it where it makes sense. For example in Ansible, it would be possible to almost halve the number of running threads if worker processes were switched to a threadless implementation, since the otherwise single-threaded WorkerProcess gains nothing from having a distinct broker thread.

UNIX fork()

In its UNIX manifestation, fork() is a defective abstraction protected by religious symbolism and dogma, conceptualized at a time long predating the 1984 actualization of the problem it failed to solve. A full description of this exceeds any one paragraph, and an article in drafting since October already in excess of 8,000 words has not yet succeeded in fully capturing it.

For our purposes it is sufficient to know that, as when mixed with most UNIX facilities, mixing fork() with threads is extremely unsafe, but many UNIX programs presently rely on it, such as in Ansible’s forking of per-task worker processes. For that reason in the Ansible extension, Mitogen cannot be permanently active in the top-level process, but only after fork within a “connection multiplexer” subprocess, and within the per-task workers.

In upcoming work, there is a renewed desire for a broker to be active in the top-level process, but this is extremely difficult while remaining compatible with Ansible’s existing forking model. A threadless mode would be immediately helpful there.

Python 2.4

Another manifestation of fork() trouble comes in Python 2.4, where the youthful implementation makes no attempt to repair its threading state after fork, leading to incurable deadlocks across the board. For this reason when running on Python 2.4, the Ansible extension disables its internal use of fork for isolation of certain tasks, but it is not enough, as deadlocks while starting subprocesses are also possible.

A common idea would be to forget about Python 2.4 as it is too old, much as it is tempting to imagine HTTP 0.9 does not exist, but as in that case, Mitogen treats Python not just as a language runtime, but as an established network protocol that must be complied with in order to communicate with infrastructure that will continue to exist long into the future.

Implementation Approach

Recall it is not possible for a user thread to block without waiting on a Latch. With threadless mode, we can instead reinterpret the presence of a waiting Latch as the user’s indication some network IO is pending, and since the user cannot become unblocked until that IO is complete, and has given up forward execution in favour of waiting, Latch.get() becomes the only location where the IO loop must run, and only until the Latch that caused it to run has some result posted to it by the previous iteration.

@mitogen.main(threadless=True)
def main(router):
    host1 = router.ssh(hostname='a.b.c')
    host2 = router.ssh(hostname='c.b.a')

    call1 = host1.call_async(os.system, 'hostname')
    call2 = host2.call_async(os.system, 'hostname')

    print call1.get().unpickle()
    print call2.get().unpickle()

In the example, after the (presently blocking) connection procedure completes, neither call_async() wakes any broker thread, as none exists. Instead they enqueue messages for the broker to run, but the broker implementation does not start execution until call1.get(), where get() is internally synchronized using Latch.

The broker loop ceases after a result becomes available for the Latch that is executing it, only to be restarted again for call2.get(), where it again runs until its result is available. In this way asynchronous execution progresses opportunistically, and only when the calling thread indicated it cannot progress until a result is available.

Owing to the inconvenient existence of Latch, an initial prototype was functional with only a 30 line change. In this way, an ugly and undesirable custom synchronization primitive has accidentally become the centrepiece of an important new feature.
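The idea of letting get() drive the IO loop can be modelled in a few lines. This is a toy sketch under stated assumptions, not the prototype itself: ToyLoop, ToyLatch and this call_async() are hypothetical stand-ins, and a queue of callbacks plays the role of pending network IO.

```python
import collections

class ToyLoop:
    """Stand-in for the broker's IO loop: a queue of pending callbacks."""

    def __init__(self):
        self.pending = collections.deque()

    def defer(self, func, *args):
        self.pending.append((func, args))

    def run_once(self):
        func, args = self.pending.popleft()
        func(*args)

class ToyLatch:
    def __init__(self, loop):
        self.loop = loop
        self.items = collections.deque()

    def put(self, value):
        self.items.append(value)

    def get(self):
        # No background thread exists, so the waiter drives the loop itself,
        # stopping as soon as its own result has been posted.
        while not self.items:
            if not self.loop.pending:
                raise RuntimeError("nothing left that could produce a result")
            self.loop.run_once()
        return self.items.popleft()

def call_async(loop, func, *args):
    latch = ToyLatch(loop)
    loop.defer(lambda: latch.put(func(*args)))
    return latch

loop = ToyLoop()
call1 = call_async(loop, str.upper, "first")
call2 = call_async(loop, str.upper, "second")
assert len(loop.pending) == 2   # nothing has executed yet
print(call1.get())              # runs the loop only until call1 has a result
assert len(loop.pending) == 1   # call2's work is still waiting
print(call2.get())
```

In this model the loop makes progress only inside get(), which mirrors the opportunistic execution described above.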

Size Benefit

The intention is that threadless mode will become the new default in a future version. As it has much lower synchronization requirements, it becomes possible to move large pieces of code out of the bootstrap, including any relating to implementing the UNIX self-pipe trick, as required by Latch, and to wake the broker thread from user threads.

Instead this code can be moved to a new mitogen.threads module, where it can progressively upgrade an existing threadless mitogen.core, much like mitogen.parent already progressively upgrades it with an industrial-strength Poller as required.

Any code that can be removed from the bootstrap has an immediate benefit on cold start performance with large numbers of targets, as the bottleneck during cold start is often a restriction on bandwidth.
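For readers unfamiliar with the self-pipe trick mentioned above, here is a minimal UNIX-only sketch of the general technique. It is not Mitogen's code: wake() and wait_for_wake() are hypothetical names, and a timer thread stands in for a user thread waking the sleeper.

```python
import os
import select
import threading

read_fd, write_fd = os.pipe()
os.set_blocking(write_fd, False)

def wake():
    """Callable from any thread: makes read_fd readable."""
    try:
        os.write(write_fd, b"\0")
    except BlockingIOError:
        pass  # pipe already full, so a wake-up is already pending

def wait_for_wake(timeout):
    """Sleep in select() until woken or timed out; return True if woken."""
    readable, _, _ = select.select([read_fd], [], [], timeout)
    if readable:
        os.read(read_fd, 4096)  # drain so the next wait blocks again
        return True
    return False

# Another thread wakes the sleeper shortly after it starts waiting.
threading.Timer(0.05, wake).start()
woken = wait_for_wake(timeout=5.0)
print(woken)  # True
```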

Restrictions

Naturally this will place some restraints on execution. Transparent routing will no longer be quite so transparent, as it is not possible to execute a function call in a remote process that is also acting as a proxy to another process: proxying will not run while Dispatcher is busy executing the function call.

One simple solution is to start an additional child of the proxying process in which function calls will run, leaving its parent dedicated just to routing, i.e. exclusively dedicated to running what was previously the broker thread. It is expected this will require only a few lines of additional code to add support for in the Ansible extension.

For children of a threadless master, import statements will hang while the master is otherwise busy, but this is not much of a problem, since import statements usually happen once shortly after the first parent->child call, when the master will be waiting in a Latch.

For threadless children, no background thread exists to notice a parent has disconnected, and to ensure the process shuts down gracefully in case the main thread has hung. Some options are possible, including starting a subprocess for the task, or supporting SIGIO-based asynchronous IO, so the broker can run from the signal handler and notice the parent is gone.

Another restriction is that when threadless mode is enabled, Mitogen primitives cannot be used from multiple threads. After some consideration, while possible to support, it does not seem worth the complexity, and would prevent the previously mentioned reduction of bootstrap code size.

Ongoing Work

Mitogen has quite an ugly concept of Services, added in a hurry during the initial Ansible extension development. Services represent a bundle of a callable method exposed to the network, a security policy determining who may call it, and an execution policy governing its concurrency requirements.

Despite heavy use, it has always been an ugly feature as it partially duplicates the normal parent->child function call mechanism. Looking at services from the perspective of threadless mode reveals some notion of a “threadless service”, and how such a threadless service looks even more similar to a function call than previously.

It is possible that as part of the threadless work, the unification of function calls and services may finally happen, although no design for it is certain yet.

Summary

There are doubtlessly many edge cases left to discover, but threadless mode looks very doable, and promises to make Mitogen suitable in even more scenarios than before.

Until next time!


gamingdirectional: Detect the player’s boundary

In this article, we will start to create the boundary detection mechanism that keeps the boy within the canvas as he moves around. We will go slowly, as this topic will take a few chapters to complete. In this chapter, we will focus on one issue: the boy must not be able to move past the horizontal boundaries of 0 and 576 pixels, which are the physical limits for the boy sprite.
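A minimal sketch of such a horizontal check, assuming the sprite position is a plain x coordinate in pixels. The constant and function names here are illustrative and not taken from the game's source.

```python
LEFT_BOUNDARY = 0
RIGHT_BOUNDARY = 576

def clamp_x(x, dx):
    """Return the new x position after moving by dx, kept inside the boundary."""
    return max(LEFT_BOUNDARY, min(RIGHT_BOUNDARY, x + dx))

print(clamp_x(570, 10))  # 576
print(clamp_x(3, -10))   # 0
print(clamp_x(100, 5))   # 105
```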


Talk Python to Me: #199 Automate all the things with Python at Zapier

Do your applications call a lot of APIs? Maybe you have a bunch of microservices driving your app. You probably don’t have the crazy combinatorial explosion that Zapier does for connecting APIs! They have millions of users automating things with 1,000s of APIs. It’s pretty crazy. And they are doing it all with Python. Join me and Bryan Helmig, the CTO and co-founder of Zapier as we discuss how they pull this off with Python.

Digital Nomad: A Year in the Life


In the musical Rent, the song “Seasons of Love” asks us how we measure a year: in daylights, in sunsets, in cups of coffee, in inches, in miles, in laughter, in strife? Now, I’m certain with enough patience and energy, we could track all of this by doing some more rigorous quantified-self work. Unfortunately, I was not so exact or rigorous with my measurements throughout the year, so I’ve only tracked where I’ve been and how I got there. I travel for work and leisure extensively, and since November 25, 2017, I’ve been a traveller, a person of no fixed abode, a gypsy. I don’t have a flat or a place to call home.

Beginning My Nomadic Existence

This started because in November 2017, my lease was ending, and I knew I would be travelling for all of December. I had a few gigs in the UK, the company Winter Summit in Morocco (yeah, that was pretty cool) and a two-week trip scheduled to go back home to the US. In total, I had two or three days unaccounted for. I thought I’d save a bunch of money by leaving my flat and moving my property into a storage unit, and then in 2018, I’d begin the search for a new flat, get a fresh start and try out a new area of London.

I was only supposed to do this for one month, but I was flexible. At the Summit, my manager asked me if I’d be okay with being sent to Paris for a client for about three or four months. I mentally figured I’d not really have time to search for a new flat with the schedule and time and said I’d do it, if I could stay over in Paris or travel on the weekends—as long as the flight costs were comparable to my normal flight “home” to London.

How One Month Became One Year

Starting in December, I was living in an amazing hotel just outside of Paris. I had a large bag (probably the only time in a year I didn’t just have carry-ons) that functioned as a wardrobe. On weekends, I’d leave the big bag and travel with just my duffel all over Europe.

When the Paris gig was just finishing, I had a couple of weeks before my two-week vacation. When I came back from the mountains (I hiked Peru!), I was on yet another long-term project, this one based outside of Amsterdam. From the very start of June until mid-October, I was there, and again, I had the same plan of weekend trips and stay-overs.

Measuring a Year in the Life of a Digital Nomad

Since October, I’ve had a few weeks of vacation in Poland and South Tanzania, a few on-site gigs scattered all over (Ireland, Germany and Dubai), and a lot of remote work spent in hotels in interesting places or while couch-surfing at my friends’ homes. Most of this time, I’ve travelled with a small carry-on duffel bag: you need a lot less than you think you do.

So how do I measure a year? Here’s how I did it: in 106 flights and about 315 hours in the air; 322 nights in a hotel or hostel; 23 countries visited; and 38 days crashing on the couches of family or friends.

What I can’t show with data is just how great and rewarding this year has been. I’ve seen sights all over the globe and spent time with friends scattered around the world. I’ve been challenged with clients and done the entire gamut of business-intelligence services that we offer, from database work and dashboarding to portals and analytics. I’ve been paid to grow, travel, explore and live my best life. Simply put, when measuring this year, it’s been a success.

Some of My Most Memorable Moments

Some things that were the most memorable are below. I took the liberty of dropping in a few pictures of 2017 as they help show the type of travel I’m doing:

I went to India for work in early 2017. I was able to take some time afterwards to do some sight-seeing. Trust me, I wanted to be smiling but it was over 40°C and humid. I was happy, but hot. Coincidentally, I’m currently back in India on a month-long working gig:

Carl digital nomad India

On my birthday weekend in 2017, I was in Scotland for work, which suited me just fine. I went to the Cairngorms around Aviemore and went fly fishing and was able to practice my Spey cast in the River Spey:

Carl digital nomad

And, of course, there was Paris. Pictures are of that famous tower thing and a good friend—now a former colleague—on his last day in Paris with me:

Carl digital nomad

Then there’s Rome where I spent a weekend exploring, eating gelato and pizza pretty much non-stop:

Carl digital nomad Rome

This was one of my more unique experiences. I was in Lebanon and had a free afternoon, and a jazz festival was about to be put on in this square. So I grabbed a seat and listened for the day while eating my way through half the menu:

Carl digital nomad Lebanon

In Peru, I did the tourist thing and hiked Machu Picchu, but this was my favourite shot. Directly beneath the dog is a fairly sheer cliff about 100-200m down. It’s the horseman’s dog, and I’m sad that I cannot remember his name:

Carl digital nomad Peru

There were two team gatherings this year: one with just the team I’m on (Team Cloudripper) and the next with the entire company for the summer meet-up. I manned the grill:

Carl digital nomad team meetups

This was taken just before a thunderstorm in Leiden. I made some new friends in Leiden and because I was working so close for a few months, I was able to spend a lot of time with them and became a better person through it:

Carl digital nomad Leiden

A photo of me hiking the Laurel Forest in Madeira (it looks like a pterodactyl could swoop in at any moment):

Carl digital nomad Madeira

My view while chilling on a boat in Malta:

Carl digital nomad Malta

These two are just before sunrise on the northern point of Zanzibar and beers in the Czech Republic:

Carl digital nomad Czech Republic

Tips and Tricks for Would-be Digital Nomads

  1. You need way fewer things than you think you do. Underwear and socks are cheap; you can always buy more. You can do laundry in your sink if you’re hard-pressed enough, or you can find a dry-clean service (most of them do a reasonably priced wash-and-fold service).
  2. Have a back-up plan for everything and be flexible. It’s not always a guarantee that something will work out. Try to have a contingency plan in place, or put yourself in a major city in the daytime, not late at night if you know you need to travel. Within Europe, you can usually get a flight, train or bus even fairly last minute for a decent cost.
  3. You’re going to spend a lot on food and drinks. Every meal is pretty much going to be eaten out. This adds up far more than you will probably budget for. Accept this early on. You’re saving on rent, after all. 😊
  4. You’re going to lose things. This year, I’ve lost clothing all over: a nice winter coat, a Kindle, my custom sunglasses and two very large power banks. Sigh. Move on and buy new ones.
  5. Hone the skill of sleeping upright. This will come in handy on a plane, train or bus, and the extra sleep will go a long way. You’ll spend a lot of time travelling, easily 10-20 hours per week, and it will help you out tremendously to learn how to live (and sleep) flexibly.

Your Adventure Awaits

While it has its challenges, the life of a digital nomad is truly the life for me. I wouldn’t have it any other way. If you think it could be the right fit for you, we’re hiring, so check out our available positions. Your greatest adventure could be ahead.

The post Digital Nomad: A Year in the Life appeared first on InterWorks.

InterWorks