Lightning talks

  • Prompt_toolkit 2.0, dbcli

Setting expectations for Open Source participation

Brett Cannon

Video is here.

  • Dev lead for the Python extension of Visual Studio Code
  • Python core dev (since April 2003)
  • Contributor to over 80 OSS projects (most of those are fixing typos)
  • I have lived corporate OS and community OS

  • What is the purpose of an OS project/community?
    • To collaborate on the maintenance of a project
    • To have fun
    • We want to keep the project going
    • But we also want to enjoy it
  • How to sustain?
    • Attract new people
    • Retain current people
    • There’s always attrition
      • boredom
      • death
    • Tricky balancing act

What’s the goal?

  • Set reasonable expectations so that OS works for everyone
    • many people are volunteers
  • Not always succeeding
    • Cory Benfield
      • “I think working in OSS has made me more bitter and short-tempered”
      • Maintainer of the requests library
      • Sad that we have to worry about his spouse asking him: “Quit doing Open Source. It’s making you a terrible person.”
  • I had to take a month off in 2016 due to burnout
  • People tend to forget 2 things
    • Everything in Open Source has a cost
      • There is still time, effort, and emotional output
      • If you send me a PR, you are asking me to take time away from my wife and family
      • I only have so much time on this planet and I try to use it wisely
  • Open source should be a series of unsolicited kindnesses
    • It’s like people giving you “gifts” that you didn’t necessarily ask for
    • No one would feel like they are being abused or misunderstood

Scenarios

  • Meet Stuart and Brett
  • Using Open Source
    • Brett -> Stuart
    • Someone leaving out pamphlets you find offensive
  • Providing feedback
    • Stuart -> Brett
    • can be like a family member telling you, “you’re stupid”
    • Upvoted on HN: I like Python… but… this is stupid, that’s dumb,
      • That’s my fault
      • That’s my friend’s fault
      • That doesn’t make sense
      • “Hey, I said I liked it”
      • Call a feature stupid, you’re calling me stupid
      • The Internet isn’t written in pencil, it’s written in ink
      • If you don’t understand why it’s written that way, come to python-dev and ask!
      • You can still have a critical question, without being mean about it
  • Submitting a contribution
    • Stuart -> Brett
    • It can be like someone who tries to give you a puppy you didn’t want
    • Like someone saying, “I’m fixing your mess”
    • I’m taking on the responsibility of that puppy pooping and walking for the next 10 years
    • There are over 7 million of us.
    • Am I going to make millions of books printed on dead trees obsolete?
    • Here’s a PR because something is really broken. That’s really negative.
  • Contribution feedback/Acceptance
    • It can be like someone saying, “you’re doing it wrong” or reacting with “why don’t you love me?!?”
    • I can’t do this; I don’t want to break people’s code
    • I’m mad at you because you’re preventing me from contributing
  • Maintaining
    • It can be like arguing with your siblings about politics
    • I’ve almost gone to tears, arguing with another core dev at PyCon
    • Sounds like I’m biased towards maintainers
    • Scale problem; it’s very easy for a tenth of the community to overwhelm maintainers with negativity
  • Kindness does not require anything in return
    • When you send me that PR, I may have to say no for a number of reasons
    • If I can’t take it, I’m sorry, but it’s not personal
    • If I ask you for changes and you can’t do it, that’s fine.
    • It’s not a bargain
  • How should we act towards each other
    • Open (to people doing kindness for us)
    • Considerate
    • Respectful
    • Turns out that this is the PSF code of conduct
    • “Three-way handshake of kindness”
  • How should we communicate
    • Assume you are asking me a favour
      • “Review my PR right now, damnit!”
    • Assume your boss will read what you say
      • I pay attention to what company people work for
      • And I have a very good memory
    • Assume your family will read what you say
  • Pay for OS with kindness
    • Otherwise it leads to burnout
    • We have an amazing community that is known for being respectful and kind
    • I can’t imagine what it’s like in any other OS community

Type-checked Python in the Real World

  • carljm
  • officially old

Video is here.

Why type

  • I’ve been using Python for years. Why do I care?

def process(self, items): ...

  • What is items?
    • duck typing
    • a collection
    • it has stuff
  • Code is written once, but maintained for a long time
  • The contract that we just described, I have to read through everything, line-by-line
  • How do I know that I’m conforming to this contract everywhere
  • Maybe I need to add some functionality
    • How do I know that I’m complying everywhere in my code base?
  • With type annotation, all ambiguity goes away
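As a hedged sketch of the point above, here is what an annotated version of process might look like. The Item class, its name attribute, and the return type are assumptions made for illustration; the notes only give the unannotated signature.

```python
from typing import List, Sequence


class Item:
    """Hypothetical item type; not from the talk."""

    def __init__(self, name: str) -> None:
        self.name = name


class Processor:
    def process(self, items: Sequence[Item]) -> List[str]:
        # The signature now states the contract: any sequence of Item
        # objects goes in, and a list of strings comes out.
        return [item.name.upper() for item in items]


names = Processor().process([Item("a"), Item("b")])
```

A reader no longer has to trace the body line by line to learn what items is allowed to be, and a type checker can verify every caller against the same contract.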

  • People have been putting the same information into docs for years
  • But at some point, someone will update the function signature and not the docstring
    • worse than useless
  • “That’s cool, but I don’t need it; I’d catch it with a test!” ~Pythonista
    • Riiiiiiight. I love tests! But…
    • You don’t need to test things that are impossible and you make things impossible with types

How to even type

  • square.py
    def square(x: int) -> int:
        return x**2

Let’s type this!

    $ pip install mypy

mypy is the most commonly used type checker, maintained at Dropbox

  • Type inference
    • We’ve described types of inputs
    • it can infer types of assignments
    • it can infer the types of lists/containers
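A small sketch of what inference buys you, assuming a file checked with mypy. Only the function signatures are annotated; the comments describe what a checker would typically infer for the local variables. The squares and total_of_squares helpers are illustrative names, not from the talk.

```python
from typing import Dict, List


def square(x: int) -> int:
    return x ** 2


def squares(values: List[int]) -> List[int]:
    # No annotation needed: the checker infers a list of int from
    # the return type of square().
    results = [square(v) for v in values]
    return results


def total_of_squares(values: List[int]) -> int:
    # An empty container gives the checker little to infer from, so this
    # is the kind of variable it may ask you to annotate explicitly.
    cache: Dict[int, int] = {}
    for v in values:
        cache[v] = square(v)
    return sum(cache.values())
```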

Review

  • Annotate your function signatures
  • Annotate variables that your type checker tells you to

Continue

  • There is a Union or Optional type
  • But, that delegates stuff to the code which should go to the type checker
  • We can use the @overload decorator to declare multiple signatures, each with its own return type
  • Generic functions
    • Define a type variable
    • from typing import AnyStr
  • Where’s my duck?
    • I want to call obj.render() and have it work
    • We could use “Object” or “AnyType”
    from typing_extensions import Protocol

    class Renderable(Protocol): ...
  • We found a duck!
  • Structural sub-typing vs nominal sub-typing
  • Strict static typing is great for most production code.
  • But there are many times when you want to …
  • Escape Hatch #1
    • __getattr__ returns Any. Not great, but better than failing
  • Escape Hatch #2
    • “Normally it returns Any”, but you can tell the type checker that in this case it returns something else
  • Escape Hatch #3
    • Ignore
    • mypy can’t handle a decorator of a decorator (move on with your life)
  • Escape Hatch #4
    • “Escape hatch on an industrial scale”
    • stub (.pyi) files
    • We have lots of C files for optimizations at Instagram
    • If we put a .so file, it’s usually because we use them a lot
    • fastmath.so, fastmath.pyi
    • the .pyi file can be checked by the type checker

Gradual typing

  • Typecheck your program, even though not all expressions are typed
  • Errors everywhere, when we introduce a type checking system when there was no typing
  • Simple rule: Only functions with type annotations are checked
    • Don’t even look inside the body
    • We can introduce type annotations where we are prepared to deal with the consequences
    • Network effect
  • Start with the most used modules, that’s where you’ll get the most benefit
  • Use CI
    • Once you’ve gotten rid of type errors, you want to make sure that no one adds type errors afterward
    • mypy has good options for that
      • don’t allow any Any types in this module from now on
  • Last thing
    • painful if you come back to code later
    • yeah… you have to deal with that when you come back to do type annotations
    • Our CTO was the first person to dive into type annotations
    • Came back 2 weeks later and said “I’m done”
    • Let’s write a tool…
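The “only functions with type annotations are checked” rule from the list above can be illustrated with two functions. This is a minimal sketch with made-up function names; it relies on mypy's default behaviour of skipping the bodies of unannotated functions.

```python
def legacy_label(count):
    # Unannotated: by default mypy does not look inside this body, so the
    # str + count concatenation is not reported even though it would fail
    # at runtime if count were an int.
    return "items: " + count


def typed_label(count: int) -> str:
    # Annotated: the body is checked, so writing "items: " + count here
    # would be reported as an error. The conversion makes it correct.
    return "items: " + str(count)
```

Annotating one function at a time means the checker only starts reporting problems in code you have opted in, which is what makes adoption in a large untyped codebase tractable.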

MonkeyType

$ pip install monkeytype # of course
$ monkeytype run mytest.py
$ monkeytype stub some.module

Shows the stub annotations it thinks should be added

$ monkeytype apply some.module
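To give a feel for the idea of collecting types from real runs, here is a toy recorder. It is explicitly not how MonkeyType is implemented and uses none of its API; record_types, observed, and add are hypothetical names for illustration only.

```python
import functools
from collections import defaultdict

# Maps a function name to the set of (argument types, return type) seen.
observed = defaultdict(set)


def record_types(func):
    @functools.wraps(func)
    def wrapper(*args):
        result = func(*args)
        arg_types = tuple(type(a).__name__ for a in args)
        observed[func.__name__].add((arg_types, type(result).__name__))
        return result

    return wrapper


@record_types
def add(a, b):
    return a + b


add(1, 2)
add("a", "b")
```

After running the program or its tests, observed holds the concrete types each function was called with, which is the raw material a tool could turn into draft annotations.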

The future

  • 3.7: no more ugly string forward references (with from __future__ import annotations)
  • Fewer imports from typing module: dict not typing.Dict
  • PEP 561

Conclusions

  • Type-checked Python is here and it works
    • We prevent diffs from landing if they have type errors
  • It catches bugs and developers love it
  • With monkeytype, you can annotate large legacy codebases
  • Early days, far from perfect but good enough
  • It will get better

  • Pyre: we have switched to that. Written at FB. 5 minutes -> 45 seconds

The Rabbit and the Hare: Getting the most out of RabbitMQ

Wouldn’t recommend watching this talk but here is the video anyway.

Overview

  • What is it?
    • Message Broker
    • What’s that?
      • A server that receives messages from parts of a distributed system
      • Like AWS SQS or Google …
  • Written in Erlang in 2007
    • 3500 deployments
    • Most popular OS message broker
  • Producers and consumers
    • senders and receivers
  • Queue
    • objects that live on the RabbitMQ broker
    • FIFO
    • Properties
      • Name
      • Durable (will persist if the broker restarts)
      • Exclusive
      • Auto-deleting (will be destroyed once the last consumer disconnects)
      • Optional
  • Exchanges
    • Objects that accept messages and then route them
    • Producers do not directly send messages to queues
    • Exchanges can have properties too
    • Types
      • Direct
        • Reads message, finds binding, and routes message
        • Multiple keys can be bound to multiple queues
      • Topic
        • Wildcards: “*” matches exactly one word, “#” matches zero or more words; words are separated by periods
        • Routing key “foo.#” (routed to queue A)
        • Key “#.baz” (routed to queue B)
        • foo.*.baz (routed to queue C)
      • Fanout
        • Routes any message to all queues that it is bound to
        • Like multi-cast routing
        • Can be replicated with topic by setting all binding keys to “#”
  • Bindings
    • Connects an exchange to a queue
    • Has more metadata like a binding key
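The topic-exchange matching rules can be sketched in plain Python. This is an illustration of the wildcard semantics only (“*” for exactly one word, “#” for zero or more), not RabbitMQ's implementation; topic_matches, bindings, and queues_for are hypothetical names, and the three bindings mirror the A, B, and C examples above.

```python
from typing import List


def topic_matches(pattern: str, routing_key: str) -> bool:
    p = pattern.split(".")
    k = routing_key.split(".") if routing_key else []

    def match(i: int, j: int) -> bool:
        if i == len(p):
            return j == len(k)
        if p[i] == "#":
            # '#' may swallow zero or more words of the routing key.
            return any(match(i + 1, jj) for jj in range(j, len(k) + 1))
        if j == len(k):
            return False
        if p[i] == "*" or p[i] == k[j]:
            # '*' matches exactly one word; otherwise words must be equal.
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


bindings = {"A": "foo.#", "B": "#.baz", "C": "foo.*.baz"}


def queues_for(routing_key: str) -> List[str]:
    return sorted(q for q, pat in bindings.items() if topic_matches(pat, routing_key))
```

For example, a message with routing key foo.bar.baz satisfies all three bindings, while foo.baz satisfies only the first two because “*” insists on exactly one word in the middle.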

How to interact from Python

  • Communicate with AMQP (Advanced Message Queuing Protocol)
  • librabbitmq
  • py-amqp
  • pika
  • kombu

https://github.com/sklarsa/

Celery

The main place where you’ll see RabbitMQ in the Python ecosystem

  • What is it?
    • A vegetable. Great with PB and raisins.
    • Run functions on remote servers.
  • Basic building block is a task
    • Wrap a function with “@app.task”
    • celery will pass instructions to the exchange/queue
    • A celery worker will retrieve the message and run the function
    • “.delay()” function means go out to a worker
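The task and .delay() flow described above can be mimicked with a toy, in-process stand-in. This is not Celery and uses none of its API: the task decorator, task_queue, registry, and run_worker_once are hypothetical names, and a standard-library queue plays the role of the broker.

```python
import queue

task_queue = queue.Queue()  # stands in for the queue on the broker
registry = {}               # task name -> function, as a worker would know it


def task(func):
    registry[func.__name__] = func

    def delay(*args, **kwargs):
        # Send a message naming the task and its arguments; the function
        # is not executed by the caller.
        task_queue.put((func.__name__, args, kwargs))

    func.delay = delay
    return func


def run_worker_once():
    # A worker takes one message off the queue and runs the named task.
    name, args, kwargs = task_queue.get()
    return registry[name](*args, **kwargs)


@task
def add(x, y):
    return x + y


add.delay(2, 3)
result = run_worker_once()
```

The design point is that the caller only enqueues a description of the work, so producers stay fast and workers can be scaled or split across queues independently.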

Case Study

  • Problem
    • Time intensive computations occur nightly
    • System performance degrades during this time
    • Use two queues?
      • Long tasks and short tasks?

Gotchas

Big-O: How code slows as data grows

Ned Batchelder

http://bit.ly/bigopy

  • Two mindsets
    • CS: Math, abstract
    • SE: Pragmatic, does it work?
  • Some crossover (not much)
    • Big-O!

Big-O:

  • How your code slows as data grows
  • Not the same as running time
  • The trend over time (as the data gets larger)
  • 10x data -> ??x time
    • 10x? Not true!
  • Mathy… but doesn’t have to be

Terminology

  • O(blah blah N blah)
  • N: how much data
  • O = “Order of”
  • It is not (really) a function… (well, it kind of is)

Counting beans

  • O(N)
  • N doubles, the time doubles
    • for x in my_list
  • Well if you get beans with labels on them, you don’t need to count them
    • O(1)
    • Weird mathematician’s way of saying that N isn’t involved
    • len(my_list)

Finding words

  • Novel
    • O(N)?
  • Encyclopedia
    • Ordered
    • O(log N)
  • How you organize the data affects your run time
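The novel-versus-encyclopedia contrast can be sketched by counting steps instead of timing. The word list and function names are made up for illustration; the second function is an ordinary binary search over sorted data.

```python
def linear_steps(words, target):
    # Novel: no ordering to exploit, so look at words one by one.
    steps = 0
    for word in words:
        steps += 1
        if word == target:
            break
    return steps


def binary_steps(sorted_words, target):
    # Encyclopedia: the data is ordered, so each step halves the range.
    lo, hi = 0, len(sorted_words)
    steps = 0
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_words[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return steps


words = [f"w{i:06d}" for i in range(1000)]  # already in sorted order
slow = linear_steps(words, words[-1])
fast = binary_steps(words, words[-1])
```

With 1000 words the linear scan examines every word in the worst case, while the binary search needs on the order of log2(1000), roughly ten, comparisons.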

Other terms

  • O(1): constant time
  • O(N): linear time
  • O(N**2): quadratic time
  • Big-O:
    • complexity
    • asymptotic complexity

Determining Big-O

  • Identify your code
    • Seems silly, but in a large system you need to consider all the callers
  • Identify N
    • What are you measuring?
    • Length of string? Database entries?
  • Count the steps in a typical run
  • Keep the most significant part
    • As N gets higher and higher, the lower terms get less and less significant
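As a worked example of the recipe above, here is a sketch that counts the steps of a nested loop comparing every pair of items once. The function name is hypothetical; N is the number of items.

```python
def count_pair_checks(n):
    # Compare every item with every later item once.
    steps = 0
    for i in range(n):
        for j in range(i + 1, n):
            steps += 1
    return steps


small = count_pair_checks(10)
large = count_pair_checks(100)
```

The exact count is N*(N-1)/2, which is N**2/2 - N/2. Keeping only the most significant part and dropping the constant gives O(N**2): ten times the data costs roughly a hundred times the steps.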

Ideal: O(1)

  • len(mylist)
  • mydict[some_key]

Python complexities

https://nedbatchelder.com/text/bigo/bigo.html#13

If you have a program that’s really slow? Look to see if you are regularly looking for a value in a list.
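A hedged illustration of that advice, with made-up function names: both functions return the same result, but the first searches a list on every iteration while the second builds a set once and does constant-time lookups on average.

```python
def common_items_slow(a, b):
    # "x in b" scans the list b each time: roughly len(a) * len(b) steps.
    return [x for x in a if x in b]


def common_items_fast(a, b):
    # Build the set once, then each membership test is O(1) on average:
    # roughly len(a) + len(b) steps.
    b_set = set(b)
    return [x for x in a if x in b_set]


a = list(range(0, 1000, 2))
b = list(range(0, 1000, 3))
```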

Advanced: Amortization

  • Long-term averaging
  • Operations can take different times
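One way to see amortization at work is to watch a list grow. This sketch assumes CPython, where sys.getsizeof reports the list's current allocation; count_resizes is a hypothetical helper name.

```python
import sys


def count_resizes(n):
    # Count how many of n appends changed the list's allocated size.
    items = []
    resizes = 0
    last_size = sys.getsizeof(items)
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:
            resizes += 1
            last_size = size
    return resizes


resizes = count_resizes(10_000)
```

Only a small fraction of the appends trigger a reallocation, because the list over-allocates each time it grows. Most appends are cheap and a few are expensive, which averages out to O(1) per append over the long run.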

Advanced: Worst case

  • Typical case vs. worst case
  • Dicts also
  • Hash randomization

https://nedbatchelder.com/blog/201711/toxic_experts.html

https://nedbatchelder.com/text/bigo/bigo.html#21

Code Sprints

  • Bandit: a Python static analysis security tool
  • Zulip: open-source Slack-like chat.
  • Certbot: EFF-supported program for getting certificates from Let’s Encrypt
  • PyPI/Warehouse: PyPI has undergone huge improvements in the last couple of years
  • virtualenvwrapper
  • pipenv



Published

13 May 2018

Category

work