Morning Keynote

Lots of history of Perth, Australia (the speaker’s home) as metaphors for how to lead the community into the future.

What are Python’s “Black Swans” (things that you don’t know exist or will happen, but are obvious in retrospect)? How could the community be upended in ten years?

  • The ubiquity of phones (that don’t run Python)
  • Not needing to run Python on the server (you can do that with JavaScript)

Calls to action

  • Research and Development
  • Donate (or have your employer donate) to OSS

Practical decorators

Couldn’t get a seat. Fire code violation. Watched the video afterwards, and it was a solid talk.

Terrain, Art, Python and LIDAR

Andrew Godwin

Django contributor; Principal Engineer at Eventbrite (“I always need more lasers”).

  • SRTM
    • Shuttle Radar Topography Mission
    • 30m accuracy
    • Works well at wide scales; shows mountains and valleys well, but can’t see around valleys or mountains. A good start, but we have progressed.
  • LIDAR
    • Accurate to cm or better.
    • Usually on an aircraft, but can be on a car or something else.
    • Many major US cities have LIDAR flights flown regularly

Laser-cut profiles

How do you get data?

DEM: Digital Elevation Model. A big bitmap that contains elevations.

Laser cutting, need a “cut path”. Construct a series of SVG profiles.

Load a DEM into CSV (“because I’m lazy”)

Picks one in N rows.

Draws a contour using svgwrite.

But you need to exaggerate in the z dimension by 2.5x.
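A minimal sketch of that slicing step in pure Python (the function name, the 30 m cell size, and the toy DEM are my assumptions, not the talk’s code):

```python
# Hypothetical sketch: sample every Nth row of a DEM grid, exaggerate
# elevation 2.5x, and build SVG polyline "points" strings for the profiles.
Z_EXAGGERATION = 2.5

def dem_rows_to_profiles(dem, every_n, cell_size=30):
    """dem: 2D list of elevations in metres. Yields one SVG points string per sampled row."""
    for row in dem[::every_n]:
        points = [
            (x * cell_size, -z * Z_EXAGGERATION)  # negate: SVG y grows downward
            for x, z in enumerate(row)
        ]
        yield " ".join(f"{px},{py}" for px, py in points)

profiles = list(dem_rows_to_profiles(
    [[0, 10, 5], [2, 8, 3], [1, 1, 1]], every_n=2))
```

Each points string can then be handed to an svgwrite polyline to form a cut path.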

Works well for small items, but not for anything bigger; e.g. Scotland looks really flat.

3d printed cities

Cities are very precise. Accurate to half a meter. You can see individual trees, cars, railroad lines.

Fabrication is not the problem. The problem is the data.

LIDAR data comes in “point clouds”. Here’s the raw data from flying the city. Need to turn it into a solid surface.

Point cloud -> DEM

Tools: python-pcl, LAStools

  • Top surface -> fully sealed 3D model (with the tile base) to give the 3D printer. TIN?
  • Load the DEM… remove obvious outliers… get rid of a 50m pit in the middle of London.
  • Clamps height (top and bottom).
  • Smooths rough features (array pass in Python).
  • Writes out an STL file (a pretty common 3D model format… TIN).
  • How do you write STL?
    • Python’s struct.pack()
  • Should I have used NumPy? Yes.
  • Did I? No.
  • https://github.com/andrewgodwin/lidartile

CNC-milled National Park

You need a milling machine. Subtract from metal.

Use the US National DEM

Get the outline of the National Park

Use QGIS to cut out a park-only DEM

Toolbox > GDAL > Clip By Extent

Irregular shapes, not square.

8 hours of milling for each tiny little park

Map Projections

  • Things I won’t work with:
    • unicode
    • names
    • timezones
    • currencies
    • networks
    • map projections (congratulations!)

Future

More US National Parks

I do each one as I visit it. There are… 59 (actually 61).

Easier Milling

8 hours per piece. Really.

Tiny milling bits, I have broken many.

G-Code can pass instructions directly to the mill.

Better STL optimisation

Millions of polygons isn’t great. Even a totally flat lake bed has many, many polygons.

Blender… 3d modelling program.

Personal LiDAR. Thanks, autonomous vehicles.

https://github.com/andrewgodwin/gis_tools

From days to minutes, from minutes to milliseconds with SQLAlchemy

Leonardo Rochael Almeida

Not an expert in any of these things, but want to pass on my lessons learned

Works at Geru, a Brazilian fintech.

Backend stack: Python, Pyramid, SQLAlchemy, Postgres (also, Celery, MongoDB, Java)

SQLAlchemy: 2 aspects

The expression language (a Python DSL) + the ORM (classes mapping to tables and records mapping to instances of the class). You can use one without the other.
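A minimal sketch of the two halves side by side (assuming the SQLAlchemy 1.4+ API; the User model and in-memory SQLite are illustrative, not from the talk):

```python
import sqlalchemy as sa
from sqlalchemy.orm import declarative_base, Session

Base = declarative_base()

class User(Base):               # ORM side: a class mapped to a table
    __tablename__ = "users"
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String)

engine = sa.create_engine("sqlite://")   # in-memory DB for the sketch
Base.metadata.create_all(engine)

with Session(engine) as session:         # ORM: instances map to records
    session.add(User(name="ada"))
    session.commit()

# Expression language: a Python DSL over tables and columns, no ORM needed
stmt = sa.select(User.__table__.c.name).where(User.__table__.c.name == "ada")
with engine.connect() as conn:
    names = [row.name for row in conn.execute(stmt)]
```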

“Frameworks still require you to make decisions about how to use them…” Martin Fowler.

The ORM Trap

The Database is an external system, so you do have to think about it. If you don’t, sensible Python code -> Bad SQL access patterns.

The ORM does not help you write good, performant SQL.

Worse, this is unnoticeable at low data volumes, like during development and early production. THEN, when your system starts seeing success, things slow to a crawl.

SQLAlchemy is so good that the metaphor rarely leaks.

The Fix: Let the DB do its job. Be especially mindful of the number of round trips to the database, especially when you are looking at relationships.

Be aware of implicit queries, especially from relationships

Aim for O(1) queries per request/job/activity

Avoid looping through model instances. It’s tempting, but let the DB do it for you.
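The trap is easy to demonstrate by counting round trips. In this sketch (the models and numbers are mine, not from the talk), lazy-loading a relationship in a loop costs O(N) queries, while eager loading keeps it O(1):

```python
import sqlalchemy as sa
from sqlalchemy.orm import declarative_base, relationship, Session, selectinload

Base = declarative_base()

class User(Base):
    __tablename__ = "users"
    id = sa.Column(sa.Integer, primary_key=True)
    addresses = relationship("Address")

class Address(Base):
    __tablename__ = "addresses"
    id = sa.Column(sa.Integer, primary_key=True)
    user_id = sa.Column(sa.ForeignKey("users.id"))

engine = sa.create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    for _ in range(100):
        session.add(User(addresses=[Address()]))
    session.commit()

def count_queries(load_users):
    """Count SELECT round trips issued while touching every relationship."""
    counter = {"n": 0}
    @sa.event.listens_for(engine, "before_cursor_execute")
    def _count(conn, cursor, statement, *args):
        counter["n"] += 1
    with Session(engine) as session:
        for user in load_users(session):
            _ = user.addresses      # may trigger an implicit lazy-load SELECT
    sa.event.remove(engine, "before_cursor_execute", _count)
    return counter["n"]

lazy = count_queries(lambda s: s.query(User).all())
eager = count_queries(
    lambda s: s.query(User).options(selectinload(User.addresses)).all())
```

Lazy loading issues one query for the users plus one per user; selectinload issues two in total.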

Geru Case 1: the 24+ hour reports

It now takes minutes

The Geru Funding Model

Debenture holders buy debentures from Geru; Geru grants loans to individuals. Every month the borrower pays back money, and Geru pays back the debenture holders.

We run a report at the beginning of every month collecting everything that the borrowers paid and run that against everything that we’ve ever paid debenture holders. Eliminate rounding errors, etc.

So I replaced a loop over a huge series of attributes by sum()ing and coalescing in SQL. Push the calculations into the database.

Refactored the methods so that they all accepted a start and end period. Put that into a method that assembles filters.

INSERT .from_select(). Ask PostgreSQL to insert data from its own SELECT, which means you don’t have to send data across the wire. Brought time from 4 hours to 15 minutes.
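A hedged sketch of the INSERT … SELECT pattern (against in-memory SQLite rather than PostgreSQL, with made-up tables — the aggregation never leaves the database):

```python
import sqlalchemy as sa

metadata = sa.MetaData()
payments = sa.Table("payments", metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("amount", sa.Integer))
totals = sa.Table("totals", metadata,
    sa.Column("total", sa.Integer))

engine = sa.create_engine("sqlite://")
metadata.create_all(engine)

with engine.begin() as conn:
    conn.execute(payments.insert(), [{"amount": 10}, {"amount": 32}])
    # INSERT ... FROM SELECT: the DB aggregates and inserts in one
    # statement; no rows cross the wire to Python
    sel = sa.select(sa.func.sum(payments.c.amount).label("total"))
    conn.execute(totals.insert().from_select(["total"], sel))
    result = conn.execute(sa.select(totals.c.total)).scalar()
```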

Conclusions

  • You need to understand SQL even if you are using SQLAlchemy’s DSL and ORM.
  • “Law of Leaky Abstractions”
  • Read the SELECT Documentation from the database
  • GROUP BY and aggregation functions
  • Understand windowing functions, calculating aggregates without reducing dimensionality
  • THEN study SQLAlchemy
    • Be aware of the underlying queries
  • Push work to the DB
    • As much as possible
    • But not too much

Only select the amount of information that you need

If you are bound to a loop… don’t do a complex query before you enter the loop. Be mindful of the number of queries that you’re making.

Everything at Once: Python’s Many Concurrency Models

Jess Shapiro

Intro

  • Doing multiple things “at once”
  • Isn’t just “on” or “off”
  • Many available options in Python
  • Asyncio coroutines
  • Python threads
  • GIL-released threads
  • Multiprocessing
  • Distributed tasks

Parallelism

  • Do things actually happen simultaneously?
    • How does performance scale when you add more CPUs
    • Good diagram to illustrate this
    • Even models w/o “true” parallelism are useful

Minimum Schedulable Unit

  • Code is made up of semantic chunks
  • How big are the chunks that can be run independently

Data Sharing and Isolation

  • How isolated is data between tasks?
  • How long does data stay the same for?
    • E.g. banking info
  • What tools can be used to share data?
    • E.g. share data by default: locks, flags (reduce concurrency)

Asyncio coroutines

  • One coroutine runs at a time
  • MSU: “Awaitable block”
  • Global state is shared within an awaitable block
  • Event loop
  • An awaitable block runs between the def and the first await, and between awaits
  • Example of a well-engineered coroutine vs. one where a synchronous block runs for too long
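A minimal illustration of awaitable blocks (names are mine): each `await asyncio.sleep(0)` is a point where the event loop can switch coroutines, so the two workers interleave.

```python
import asyncio

results = []

async def worker(name, results):
    # each await is the boundary of an "awaitable block": the event loop
    # can only switch to another coroutine at these points
    for i in range(3):
        await asyncio.sleep(0)      # yield control to the event loop
        results.append(f"{name}{i}")
        # a long synchronous call here (e.g. time.sleep) would stall
        # every other coroutine until it finished

async def main():
    await asyncio.gather(worker("a", results), worker("b", results))

asyncio.run(main())
```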

Python Threads

  • One thread runs (GIL)
  • MSU: “Bytecode”
  • Global state is shared but consistent only for single-bytecode ops*
  • Combined scheduling
  • GIL released threads
    • Multiple threads run simultaneously
    • MSU: Host processor instruction (x86, etc)
    • Global state is shared but unreliable
    • OS-scheduled
    • You would only want to do this if you’re implementing a core algorithm in C, Rust, etc. and provide a robust API to it.
      • Don’t touch the Python interpreter while you do that

Multiprocessing

  • Multi processes run simultaneously
  • MSU: Host processor instruction (x86, etc)
  • Global state starts the same as parent, but evolves independently
  • OS-scheduled
  • fork() system call

Distributed tasks

  • Multiple tasks run simultaneously
  • MSU: varies: often the entire application for some subset of data
  • Global state totally independent: often “process-like”
  • Central orchestrator
  • E.g. 1 TB of data split by orchestrator and sent independently to workers
  • Can be a lot of overhead using this concurrency model
    • Want to use this only with lots of data …. 100s of GB to 100s of TB to 100s of EB

When to use each

  • Asyncio
    • Great when performance is IO-bound
    • Network requests
    • Disk read/write
    • Doesn’t allow us to gain CPU performance
    • Great for starting a new code base from scratch
      • Legacy code may often have had threading
  • Threads
    • When you need preemptive multitasking (tasks need to interrupt each other)
    • Integrate synchronous code
    • Need fine-grained concurrency
    • Python “glue” for GIL-unlocked C/Rust
  • Processes
    • Don’t need substantial inter-task communication
      • Share data with pipes, memory-mapped regions
    • Full parallelism required for Python code
    • If you can easily split the data apart, do this
  • Distributed tasks
    • Highly-segmentable and distributable workload
    • Need for shared state is minimal
    • Large enough to use an orchestrator

https://carbon.now.sh

Thanks to Mazarine on Market for avocado toast and espresso

@haikuginger on GitHub

Questions

  • I switched from threads to asyncio by switching jobs. How do you actually migrate?
  • We wanted to share so much state, that there wasn’t enough RAM
    • If you have a large enough dataset, move to distributed.
    • Go out to the filesystem
  • Will you be posting your slides?
    • Eventually. I don’t know where.

Understanding Python’s Debugging Internals

Liran Haimovitch from Rookout

  • Co-founder and CTO of Rookout
  • Advocate of modern software methodologies
  • Passion is to understand how software actually works
  • Rookout
    • platform for live-data collection and delivery
    • on demand within seconds
    • set non-breaking breakpoints, no restarts, extra coding or redeployment required

Debuggers

  • pdb
  • ipdb
  • PyDev debugger (PyCharm, Eclipse)
  • Rookout
  • IDLE

What do they have in common? Based on sys.settrace()

sys.settrace()

  • register a callback for the Python interpreter
  • invoked on interpreter events
    • function call
    • line exec
    • function return
    • exception raise
  • E.g.
def simple_tracer(frame, event, arg):
    pass

There exist global_trace and local_trace. Local tracing has a significant performance impact, so we can customize which functions we want to trace and which to ignore.
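A hedged sketch of that global/local split (function names are mine): the global trace fires on every function call and decides, per function, whether to pay for line-level tracing by returning a local trace function or None.

```python
import sys

events = []

def local_trace(frame, event, arg):
    # fires for 'line', 'return', 'exception' events inside a traced frame
    events.append((event, frame.f_lineno))
    return local_trace

def global_trace(frame, event, arg):
    # fires once per function call; returning None skips costly local tracing
    if frame.f_code.co_name != "traced":
        return None                     # ignore this function entirely
    events.append(("call", frame.f_code.co_name))
    return local_trace                  # opt in to per-line tracing

def traced():
    x = 1
    return x

def ignored():
    return 2

sys.settrace(global_trace)
traced()       # call + line + line + return events recorded
ignored()      # only the global trace fires, and it declines
sys.settrace(None)
```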

Multithreading?

  • threading.settrace()
    • Must be called as early as possible - or you’ll miss threads
    • Doesn’t cover the underlying ‘thread’ module and other low-level implementations
  • gevent/eventlet
    • Global tracing function will be shared among greenlets

How to build a debugger?

Inherit from Bdb

class Debugger(Bdb):
    def __init__(self):
        Bdb.__init__(self)
        self.breakpoints = dict()
        self.set_trace()

  • Add set_breakpoint

def set_breakpoint(self, filename, lineno, method):
    self.set_break(filename, lineno)
    try:
        self.breakpoints[(filename, lineno)].add(method)
    except KeyError:
        self.breakpoints[(filename, lineno)] = {method}


def user_line(self, frame):
    if not self.break_here(frame):
        return

    (filename, lineno, _, _, _) = inspect.getframeinfo(frame)

    methods = self.breakpoints[(filename, lineno)]
    for method in methods:
        method(frame)

Performance Testing

What will be the performance impact?

Benchmarks

  • Multiple scenarios with each implementation
    • test w/o debugger
    • test w/ debugger but no breakpoints
    • test with a breakpoint in a different file
    • test with a breakpoint in the same file
def empty_method():
    pass


def simple_method():
    pass

Performance optimization

  • Avoid local tracing
  • Optimize “call” events
  • Optimize “line” events

We forked bdb, and optimized it. Much better performance. Still big performance hit with breakpoint in the same file.

Used Cython and gained a performance boost of 2.5x.

I was getting desperate. I stopped and thought about what I was doing.

Insights

  • bdb is naive
  • Performance may be improved by a significant margin
  • becomes gradually harder to improve
  • What happens if we set an empty tracer?
    • 4x slower
  • Turning on tracing, sets up CPython for extra work
    • Some of this is in Python
  • Some comes from C: maybe_call_line_trace eval.c:4384
  • So what did we do?
    • Give up.

Python Bytecode

Rookout way

  • Go to the byte code
  • Find the line of code
  • Insert our breakpoint
  • The interpreter doesn’t care about our breakpoint
  • Defer to the other debuggers when our bytecode gets called

Tools for bytecode manipulation

  • Python Standard Library (read-only)
    • inspect
    • dis (short for “disassemble”)
    • There is no way to write bytecode in memory
  • Google
    • cloud-debug-python
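The read-only stdlib tools are easy to demo (a minimal sketch; the `add` function is mine):

```python
import dis

def add(a, b):
    return a + b

# dis exposes a function's bytecode instruction stream (read-only);
# inspect/code objects tell you where in the source the code lives
instructions = [ins.opname for ins in dis.get_instructions(add)]
first_line = add.__code__.co_firstlineno    # source line the function starts on
```

This is enough to *find* a line of code in bytecode; rewriting it in memory is what the standard library won’t do for you.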

After you break?

>>> inspect.getframeinfo(inspect.currentframe())

def test_vars():
    mystr = "mystr"
    mydict = {'foo': 'bar'}
    mylist = [1, 2, 3]
    print(inspect.currentframe().f_locals)

Use cases




Published

03 May 2019

Category

work
