PyCon 2019: Day 0
Morning Keynote
Lots of history of Perth, Australia (the speaker’s home) as metaphors for how to lead the community into the future.
What are Python’s “Black Swans” (things you don’t know exist or will happen, but that are obvious in retrospect)? How could the community be upended in ten years?
- The ubiquity of phones (that don’t run Python)
- Not needing to run Python on the server (you can do that with JavaScript)
Calls to action
- Research and Development
- Donate (or have your employer donate) to OSS
Practical decorators
Couldn’t get a seat. Fire code violation. Watched the video afterwards, and it was a solid talk.
Terrain, Art, Python and LIDAR
Andrew Godwin
Django contributor; Principal Engineer at Eventbrite (“I always need more lasers”).
- SRTM
- Shuttle RADAR Topography Mission
- 30m accuracy
- Works well at wide scales: mountains and valleys show up well, but it can’t see around valleys or mountains. A good start, but we have progressed.
- LIDAR
- Accurate to cm or better.
- Usually on an aircraft, but can be on a car or something else.
- Many major US cities have LIDAR flights flown regularly
Laser-cut profiles
How do you get data?
DEM: Digital Elevation Model. A big bitmap that contains elevations.
Laser cutting, need a “cut path”. Construct a series of SVG profiles.
Loads a DEM into CSV (“because I’m lazy”)
Picks one in N rows.
Draws a contour using svgwrite.
But you need to exaggerate in the z dimension by 2.5x.
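The pipeline above can be sketched in plain Python (the function name, sampling stride, and hand-built SVG markup are my invention; the talk used the svgwrite library, but writing the polyline directly keeps this self-contained):

```python
import csv

def dem_csv_to_svg(csv_path, n=8, z_scale=2.5, x_step=2.0):
    """Read a DEM exported as CSV (one row of elevations per line),
    keep one row in N, and return an SVG string with one polyline
    profile per kept row, exaggerating elevation by z_scale."""
    with open(csv_path) as f:
        rows = [list(map(float, r)) for r in csv.reader(f) if r]
    polylines = []
    for i, row in enumerate(rows[::n]):  # pick one in N rows
        base_y = i * 100.0               # stack the profiles vertically
        pts = " ".join(
            f"{x * x_step:.1f},{base_y - z * z_scale:.1f}"
            for x, z in enumerate(row)
        )
        polylines.append(f'<polyline points="{pts}" fill="none" stroke="black"/>')
    return ('<svg xmlns="http://www.w3.org/2000/svg">'
            + "".join(polylines) + "</svg>")
```

Each polyline is one laser-cut profile; `z_scale` is the 2.5x vertical exaggeration mentioned above.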
Works well for small items, but not for anything bigger. E.g. Scotland looks really flat.
3d printed cities
Cities are very precise. Accurate to half a meter. You can see individual trees, cars, railroad lines.
Fabrication is not the problem. The problem is the data.
LIDAR data comes in “point clouds”. Here’s the raw data from flying the city. Need to turn it into a solid surface.
Point cloud -> DEM
python-pcl
lastools
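python-pcl and lastools do the real work; the core gridding idea — bin every (x, y, z) return into a raster cell and keep the highest z as the top surface — can be sketched as (the function name and cell size are mine):

```python
def points_to_dem(points, cell_size=1.0):
    """Grid a LIDAR point cloud into a simple DEM: for each (x, y)
    cell, keep the maximum z (the top surface, e.g. roofs and
    treetops rather than the ground beneath them).
    Returns a dict mapping (col, row) -> elevation."""
    dem = {}
    for x, y, z in points:
        cell = (int(x // cell_size), int(y // cell_size))
        if cell not in dem or z > dem[cell]:
            dem[cell] = z
    return dem
```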
- Top surface -> Fully sealed 3D model (with the tile base) to give the 3D printer. TIN?
- Loads the DEM, drops obvious outliers (e.g. gets rid of a 50m pit in the middle of London).
- Clamps height (top and bottom).
- Smooths rough features (array pass in Python).
- Writes out an STL file (a pretty common 3D model format… TIN).
- How do you write STL?
- Python’s struct.pack()
- Should I have used NumPy? Yes.
- Did I? No.
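A binary STL file is an 80-byte header, a uint32 triangle count, then 50 bytes per triangle: a float normal, three float vertices, and a 2-byte attribute field. A minimal struct.pack() writer (function name is mine, not from the talk):

```python
import struct

def write_binary_stl(path, triangles):
    """Write a binary STL file. triangles is a list of
    (normal, (v1, v2, v3)) tuples, each vector an (x, y, z) of floats."""
    with open(path, "wb") as f:
        f.write(b"\0" * 80)                         # 80-byte header
        f.write(struct.pack("<I", len(triangles)))  # uint32 triangle count
        for normal, verts in triangles:
            f.write(struct.pack("<3f", *normal))    # facet normal
            for v in verts:
                f.write(struct.pack("<3f", *v))     # three vertices
            f.write(struct.pack("<H", 0))           # attribute byte count
```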
- https://github.com/andrewgodwin/lidartile
CNC-milled National Park
You need a milling machine. Subtract from metal.
The US National DEM
Get the outline of the National Park
Use QGIS to cut out a park-only DEM
Toolbox > GDAL > Clip By Extent
Irregular shapes, not square.
8 hours of milling for each tiny little park
Map Projections
- Things I won’t work with:
- unicode
- names
- timezones
- currencies
- networks
- …
- map projections (congratulations!)
Future
More US National Parks
I do each one as I visit it. There are… 59 (actually 61).
Easier Milling
8 hours per piece. Really.
Tiny milling bits, I have broken many.
G-Code can pass instructions directly to the milling machine.
Better STL optimisation
Millions of polygons isn’t great. Even a totally flat lake bed has many, many polygons.
Blender… 3d modelling program.
Personal LiDAR. Thanks autonomous vehicles.
https://github.com/andrewgodwin/gis_tools
From days to minutes, from minutes to milliseconds with SQLAlchemy
Leonardo Rochael Almeida
Not an expert in any of these things, but want to pass on my lessons learned
Works at Geru, a Brazilian fintech.
Backend stack: Python, Pyramid, SQLAlchemy, Postgres (also, Celery, MongoDB, Java)
SQLAlchemy: 2 aspects
The expression language (a Python DSL) + the ORM (classes mapping to tables and records mapping to instances of the class). You can use one without the other.
“Frameworks still require you to make decisions about how to use them…” Martin Fowler.
The ORM Trap
The Database is an external system, so you do have to think about it. If you don’t, sensible Python code -> Bad SQL access patterns.
The ORM does not help you write good, performant SQL.
Worse, this is unnoticeable at low data volumes, like during development. And Early Production. THEN when you’re getting success on your system, things start slowing to a crawl.
SQLAlchemy is so good that the metaphor rarely leaks.
The Fix: Let the DB do its job. Be especially mindful of the number of round trips to the database, especially when you are looking at relationships.
Be aware of implicit queries, especially from relationships
Aim for O(1) queries per request/job/activity
Avoid looping through model instances. It’s tempting, but let the DB do it for you.
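The “let the DB do it for you” point, illustrated with plain sqlite3 (table and column names are invented for the sketch): instead of pulling every row into Python, ask the database for the aggregate in a single round trip:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payment (borrower_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO payment VALUES (?, ?)",
    [(1, 100.0), (1, 50.0), (2, 75.0)],
)

# Anti-pattern: O(N) rows cross the wire and Python does the summing
# (with an ORM this is often also N+1 queries over relationships).
total_py = sum(row[0] for row in conn.execute("SELECT amount FROM payment"))

# Better: one query, one round trip -- the DB does the aggregation.
(total_sql,) = conn.execute("SELECT SUM(amount) FROM payment").fetchone()

assert total_py == total_sql == 225.0
```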
Geru Case 1: the 24+ hour reports
It now takes minutes
The Geru Funding Model
Debenture holders buy debentures from Geru; Geru grants loans to individuals. Every month the borrowers pay back money, and Geru pays back the debenture holders.
We run a report at the beginning of every month collecting everything the borrowers paid and reconcile it against everything we’ve ever paid to debenture holders. Eliminate rounding errors, etc.
So I replaced a loop over a huge series of attributes with sum()ing and coalescing in SQL. Push the calculations into the database.
Refactored the methods so they all accepted a start and end period. Put that into a method that assembles filters.
INSERT .from_select(). Ask PostgreSQL to insert data from its own SELECT, which means you don’t have to send data across the wire. Brought time from 4 hours to 15 minutes.
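The same “keep the data inside the database” trick, shown here with raw SQL on sqlite3 (the talk used SQLAlchemy’s insert(...).from_select(...) against PostgreSQL; the tables below are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payment (borrower_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE monthly_total (borrower_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO payment VALUES (?, ?)",
    [(1, 100.0), (1, 50.0), (2, 75.0)],
)

# INSERT ... SELECT: rows flow from one table into the other entirely
# inside the database -- nothing crosses the wire to Python.
conn.execute(
    "INSERT INTO monthly_total (borrower_id, total) "
    "SELECT borrower_id, SUM(amount) FROM payment GROUP BY borrower_id"
)

totals = dict(conn.execute("SELECT borrower_id, total FROM monthly_total"))
assert totals == {1: 150.0, 2: 75.0}
```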
Conclusions
- You need to understand SQL even if you are using the SQLAlchemy DSL and ORM.
- “Rule of leaky abstractions”
- Read the SELECT documentation for your database
- GROUP BY X and aggregation functions
- Understand window functions: calculating aggregates without reducing dimensionality
- THEN study SQLAlchemy
- Be aware of the underlying queries
- Push work to the DB
- As much as possible
- But not too much
Only select the amount of information that you need
If you are bound to a loop… do the complex query before you enter the loop, not inside it. Be mindful of the number of queries that you’re making.
Everything at Once: Python’s Many Concurrency Models
Jess Shapiro
Intro
- Doing multiple things “at once”
- Isn’t just “on” or “off”
- Many available options in Python
- Asyncio coroutines
- Python threads
- GIL-released threads
- Multiprocessing
- Distributed tasks
Parallelism
- Do things actually happen simultaneously?
- How does performance scale when you add more CPUs?
- Good diagram to illustrate this
- Even models w/o “true” parallelism are useful
Minimum Schedulable Unit
- Code is made up of semantic chunks
- How big are the chunks that can be run independently
Data Sharing and Isolation
- How isolated is data between tasks?
- How long does data stay the same for?
- E.g. banking info
- What tools can be used to share data?
- E.g. share data by default: locks, flags (reduce concurrency)
Asyncio coroutines
- One coroutine runs at a time
- MSU: “Awaitable block”
- Global state is shared within an awaitable block
- Event loop
- Awaitable block is between def and await, and between awaits
- Example of a well-engineered block vs. one where a synchronous block runs for too long
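A tiny sketch of the “awaitable block” idea: everything between awaits runs without interruption, and the event loop only switches coroutines at an await point:

```python
import asyncio

order = []

async def worker(name):
    # Everything between awaits runs atomically: no other coroutine
    # can interleave until we hit the next await.
    order.append(f"{name}:start")
    await asyncio.sleep(0)  # yield control back to the event loop
    order.append(f"{name}:end")

async def main():
    await asyncio.gather(worker("a"), worker("b"))

asyncio.run(main())
print(order)  # ['a:start', 'b:start', 'a:end', 'b:end']
```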
Python Threads
- One thread runs (GIL)
- MSU: “Bytecode”
- Global state is shared but consistent only for single-bytecode ops*
- Combined scheduling
- GIL-released threads
- Multiple threads run simultaneously
- MSU: Host processor instruction (x86, etc)
- Global state is shared but unreliable
- OS-scheduled
- You would only want to do this if you’re implementing a core algorithm in C, Rust, etc. and provide a robust API to it.
- Don’t touch the Python interpreter while you do that
Multiprocessing
- Multi processes run simultaneously
- MSU: Host processor instruction (x86, etc)
- Global state starts the same as parent, but evolves independently
- OS-scheduled
fork() system call
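A sketch of “global state starts the same as the parent, but evolves independently”, assuming a POSIX system where the fork start method is available:

```python
import multiprocessing as mp

counter = 100  # global state inherited by the child at fork time

ctx = mp.get_context("fork")  # fork: child starts as a copy of the parent

def child(q):
    global counter
    counter += 1       # mutates the child's copy only
    q.put(counter)

q = ctx.Queue()
p = ctx.Process(target=child, args=(q,))
p.start()
result = q.get()
p.join()
print(result)   # 101: the child saw the parent's value, then diverged
print(counter)  # 100: the parent's copy is unchanged
```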
Distributed tasks
- Multiple tasks run simultaneously
- MSU: varies: often the entire application for some subset of data
- Global state totally independent: often “process-like”
- Central orchestrator
- E.g. 1 TB of data split by orchestrator and sent independently to workers
- Can be a lot of overhead using this concurrency model
- Want to use this only with lots of data …. 100s of GB to 100s of TB to 100s of EB
When to use each
- Asyncio
- Great when performance is IO-bound
- Network requests
- Disk read/write
- Doesn’t allow us to gain CPU performance
- Great for starting a new code base from scratch
- Legacy code may often have had threading
- Threads
- When you need preemptive multitasking (tasks need to interrupt each other)
- Integrate synchronous code
- Need fine-grained concurrency
- Python “glue” for GIL-unlocked C/Rust
- Processes
- Don’t need substantial inter-task communication
- Share data with pipes, memory-mapped regions
- Full parallelism required for Python code
- If you can easily split the data apart, do this
- Distributed tasks
- Highly-segmentable and distributable workload
- Need for shared state minimal
- Large enough to use an orchestrator
Thanks to Mazarine on Market for avocado toast and espresso
@haikuginger on GitHub
Questions
- I switched from threads to asyncio by switching jobs. How do you actually migrate?
- We wanted to share so much state that there wasn’t enough RAM
- If you have a large enough dataset, move to distributed.
- Go out to the filesystem
- Will you be posting your slides?
- Eventually. I don’t know where.
Understanding Python’s Debugging Internals
Liran Haimovitch from Rookout
- Co-founder and CTO of Rookout
- Advocate of modern software methodologies
- Passion is to understand how software actually works
- Rookout
- platform for live-data collection and delivery
- on demand within seconds
- set non-breaking breakpoints, no restarts, extra coding or redeployment required
Debuggers
- pdb
- ipdb
- PyDev debugger (PyCharm, Eclipse)
- Rookout
- IDLE
What do they have in common? Based on sys.settrace()
sys.settrace()
- register a callback for the Python interpreter
- invoked on interpreter events
- function call
- line exec
- function return
- exception raise
- E.g.
def simple_tracer(frame, event, arg):
    print(frame.f_code.co_name, event)
    return simple_tracer  # returning a trace function enables local (per-line) tracing
There exists global_trace and local_trace. Local tracing has a significant performance impact. So we can customize which functions we want to trace and which we want to ignore.
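A sketch of that selective tracing: the global tracer returns a local tracer only for the functions we care about, so the expensive per-line callbacks are skipped everywhere else (function names are mine):

```python
import sys

events = []

def local_trace(frame, event, arg):
    # Per-line tracing: this is the expensive part.
    events.append((frame.f_code.co_name, event, frame.f_lineno))
    return local_trace

def global_trace(frame, event, arg):
    # Called once per function call; only opt in to local tracing
    # for the functions we actually want to watch.
    if frame.f_code.co_name == "interesting":
        return local_trace
    return None

def interesting():
    x = 1
    return x

def boring():
    return 2

sys.settrace(global_trace)
interesting()
boring()
sys.settrace(None)

print(events)  # only events from interesting(); boring() was never traced
```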
Multithreading?
threading.settrace()
- Must be called as early as possible - or you’ll miss threads
- Doesn’t cover the underlying ‘thread’ module and other low-level implementations
- gevent/eventlet
- Global tracing function will be shared among greenlets
How to build a debugger?
Inherit from Bdb
from bdb import Bdb

class Debugger(Bdb):
    def __init__(self):
        Bdb.__init__(self)
        self.breakpoints = dict()
        self.set_trace()
- Add set_breakpoint
def set_breakpoint(self, filename, lineno, method):
    self.set_break(filename, lineno)
    try:
        self.breakpoints[(filename, lineno)].add(method)
    except KeyError:
        self.breakpoints[(filename, lineno)] = {method}

def user_line(self, frame):
    if not self.break_here(frame):
        return
    (filename, lineno, _, _, _) = inspect.getframeinfo(frame)
    methods = self.breakpoints[(filename, lineno)]
    for method in methods:
        method(frame)
Performance Testing
What will be the performance impact?
Benchmarks
- Multiple scenarios with each implementation
- test w/o debugger
- test w/ debugger but no breakpoints
- test with a breakpoint in a different file
- test with a breakpoint in the same file
def empty_method():
pass
def simple_method():
pass
Performance optimization
- Avoid local tracing
- Optimize “call” events
- Optimize “line” events
We forked bdb and optimized it. Much better performance, but still a big performance hit with a breakpoint in the same file.
Used Cython and gained a further performance boost, to 2.5x.
I was getting desperate. Stopped and thought about what I was doing.
Insights
- bdb is naive
- Performance may be improved by a significant margin
- becomes gradually harder to improve
- What happens if we set an empty tracer?
- 4x slower
- Turning on tracing, sets up CPython for extra work
- Some of this is in Python
- Some comes from C: maybe_call_line_trace (eval.c:4384)
- So what did we do?
- Give up.
Python Bytecode
Rookout way
- Go to the byte code
- Find the line of code
- Insert our breakpoint
- The interpreter doesn’t care about our breakpoint
- Defer to the other debuggers when our bytecode gets called
Tools for Bytecode manipulation
- Python Standard Library (read-only)
- inspect
- dis (short for “disassemble”)
- There is no way to write bytecode in memory
- Google’s cloud-debug-python
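A read-only peek with the stdlib tools above: dis.findlinestarts() maps bytecode offsets back to source lines, which is exactly the information a breakpoint-in-the-bytecode approach needs:

```python
import dis

def sample():
    x = 1
    y = x + 2
    return y

# (bytecode offset, source line) pairs: where each source line's
# instructions begin inside the compiled code object.
starts = list(dis.findlinestarts(sample.__code__))
lines = sorted({ln for _, ln in starts if ln is not None})
print(lines)  # the source lines covered by sample()'s bytecode
```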
After you break?
>>> inspect.getframeinfo(inspect.currentframe())
def test_vars():
    mystr = "mystr"
    mydict = {'foo': 'bar'}
    mylist = [1, 2, 3]
    print(inspect.currentframe().f_locals)
Use cases
- Show off your Python skills :)
- Get source information (logging module)
- Walk up the stack (for profiling)
- Build a debugger
- https://github.com/Rookout/pycon-debugging-internals
- https://www.rookout.com/