Machete-mode debugging: Hacking your way out of a tight spot

Ned Batchelder @nedbat; maintainer of coverage.py and works at edX?

http://bit.ly/mdebug

Python can be chaotic

  • I love Python for its dynamic nature
  • fundamentally it’s unstructured
  • but we use it in a structured way
  • dynamic typing means anything can be anything
  • there is no data protection; anything can access anything
  • all objects are on the heap and there is no stack allocation

Use it to your advantage

  • Chaos got us into this mess; let’s use the chaos to get us out of this mess
  • Real projects, real problems
  • I won’t tell you which project, since I want you to like Open edX
  • Not for production! Use this in your code for no more than 10 minutes!

Case 1

  • Double importing
  • Classes defined twice
  • Django complains
  • Modules
    • Executed when imported
    • Only executed once!
    • Keeps a dict of loaded modules (sys.modules)
    • Searches path if it’s not already loaded
    • Sooooooo… how do we import modules more than once?
  • Solution?
    • Used inspect.stack
    • Create a traceback of who called the import
    • Put code right into the module header
    • Two locations were
      • import thing.apps.first.modules
      • import first.models
      • different keys in sys.modules, so the uniqueness check didn’t kick in
      • We had sys.path.append for both of our modules (which is why you shouldn’t do that)
    • Fix was to import everything using the same form
  • Lesson
    • Import really runs code
    • Different from other languages
    • There is no special import mode
    • Never put debugging in a real module
    • hard-code
    • Wrong is OK, b/c we just need to get the information
    • Don’t append to sys.path
      • Choose a disciplined way to do it
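
The double import can be reproduced in a few lines. This is a minimal sketch, not the project's actual code: it builds a throwaway package on disk (the names `thing`, `first` and `Model` are placeholders), puts an `inspect.stack()` header into the module, and imports the same file under two dotted names after appending two entries to `sys.path`.

```python
import importlib
import os
import shutil
import sys
import tempfile

# Build a throwaway package on disk: <root>/thing/first/models.py
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "thing")
app_dir = os.path.join(pkg_dir, "first")
os.makedirs(app_dir)
for directory in (pkg_dir, app_dir):
    open(os.path.join(directory, "__init__.py"), "w").close()

# Debugging header dropped straight into the module: report the name the
# module is executed under and the non-frozen frames that led to the import.
MODULE_SOURCE = '''\
import inspect
print("executing module as", __name__)
for info in inspect.stack(context=0)[1:]:
    if not info.filename.startswith("<frozen"):
        print("    reached via", info.filename, "line", info.lineno)

class Model:
    pass
'''
with open(os.path.join(app_dir, "models.py"), "w") as handle:
    handle.write(MODULE_SOURCE)

# Two sys.path entries make the same file reachable under two dotted names.
sys.path.append(root)      # allows: thing.first.models
sys.path.append(pkg_dir)   # allows: first.models
importlib.invalidate_caches()

long_form = importlib.import_module("thing.first.models")
short_form = importlib.import_module("first.models")

# Different keys in sys.modules, so the module body ran twice.
print(long_form is short_form)              # False
print(long_form.Model is short_form.Model)  # False

# Clean up the experiment.
sys.path.remove(root)
sys.path.remove(pkg_dir)
shutil.rmtree(root)
```

The header prints twice, once per dotted name, which is exactly the information needed to find the two import forms.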

Case 2

  • Finding /tmp file creators
  • We had tests that would make tempfiles
  • Some would add a clean up, but some would not
  • Run your test suite, have 20 temp files lying around
  • How do we find them?
    • I can’t grep the whole code base
  • Other languages have great static analysis tools, but Python is fundamentally dynamic, so we cannot rely on them
  • “Notice that I’m upgrading ‘grep’ to ‘static analysis tools’”
  • How about we put a flare in the file itself, or in the filename?
  • Let’s monkeypatch the standard library
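import tempfile

def my_sneaky_function(*args, **kwargs):
    """Debugging stand-in; put whatever reporting you need here."""

tempfile.mkdtemp = my_sneaky_function

# Unsuspecting product code
import tempfile
tempfile.mkdtemp() # Calls my_sneaky_function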
  • Of all the things that I’m telling you not to do in production, this is the most important
  • Read the tempfile module source!
    • handful of different functions
    • only want to tweak the filenames
    • _get_candidate_names()
  • Running the code on startup
    • have to monkeypatch before the function gets called
    • Python doesn’t have that feature
    • Does have site-packages/*.pth
    • When Python starts up, site.py runs any .pth line where line.startswith("import ")
      • “I’m showing you a lot of weird code. I didn’t write this.”
  • The sneak
    • Change _get_candidate_names() to prepend the call stack (e.g. from inspect.stack()) to the random path name
    • tmp-case53-case278-…-random.py
  • Lessons
    • Std lib is readable!
    • Std lib is patchable!
    • Use whatever you can touch and change
      • You only have to feel really, really bad about yourself for 10 minutes.
    • Do use addCleanup. It’s a much nicer way to clean up your tests

Case 3: Who is changing sys.path?

  • sys.path is being modified incorrectly
  • grep didn’t find it
  • Must be in third-party?
    • We’re not going to grep all of the 3rd party deps
  • We want a data breakpoint
    • pdb doesn’t have them
    • write a trace function
      • you can write a function that gets called for each line of the program that gets executed
import pdb
import sys

def trace(frame, event, arg):
    # Drop into the debugger once sys.path has been tampered with
    if sys.path[0].endswith("/lib"):
        pdb.set_trace()
    return trace

sys.settrace(trace)
  • 50/50 chance that this will not work at all
  • But it took me 1 minute to write this, so what do I have to lose
  • Found it: nose
  • Lessons
    • It’s not just your code
    • Dynamic analysis is very, very powerful
      • This was a very expensive thing to do
      • But it was very early on that we hit this breakpoint
      • Even if it took 8 hours, that’s faster than doing it some other way
    • Sometimes you have to use big hammers
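
The same data-breakpoint idea can be tried without an interactive debugger. This is a sketch under stated assumptions: `watched_path` stands in for `sys.path`, `culprit` stands in for the third-party code, and instead of calling `pdb.set_trace()` the trace function records where the bad value was first observed.

```python
import sys

watched_path = ["/original/entry"]   # stand-in for sys.path
hits = []                            # (function name, line number) of first sighting


def trace(frame, event, arg):
    # Runs on every call/line/return event; remember only the first time
    # the watched condition is true.
    if not hits and watched_path[0].endswith("/lib"):
        hits.append((frame.f_code.co_name, frame.f_lineno))
    return trace


def innocent():
    return len(watched_path)


def culprit():
    watched_path.insert(0, "/somewhere/lib")
    return "done"   # the line event for this statement sees the change


def main():
    innocent()
    culprit()


sys.settrace(trace)
main()
sys.settrace(None)

print(hits[0][0])   # culprit
```

The check runs on every traced event, which is why this is slow on a real code base, but the first hit lands immediately after the offending statement.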

Case 4: Why is random different?

  • Randomized, but repeatable problems
  • random.seed(1702) # student.seed
  • First time: 284
  • Subsequent times: 420
  • Something wrong during import?
  • 1/0
    • Easy to drop in; 3 characters
    • Unlikely exception: ZeroDivisionError
    • Favorite three-character expression
# monkey-patch random with a trap
import random
random.random = lambda: 1/0
  • Booby-trapped random
  • 3rd-party tests actually had a default value for the random seed which was random.random()
  • Bonus: the value was never used!
  • We reported the bug and they fixed it
  • Lessons
    • Exceptions are a good way to get information
      • If no one catches it, it will come all the way back up to the top
      • And you can put information in your Exception
    • Don’t be afraid to blow things up
    • Sometimes you get lucky
      • I tried the simplest thing that I could think of and it worked
      • Sometimes
    • Don’t over-engineer things, that’s what machete mode is all about
    • Don’t share global state
      • Shared mutable state is a very, very difficult thing (ahem VIC)
    • Do suspect third-party code
      • Just importing their main code was importing their test helpers
      • It was a bit sloppy, but we’re all in this together
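
A small sketch of both halves of this case. `define_helper` is a hypothetical stand-in for importing the third-party test helper: its default argument calls `random.random()` when the `def` statement runs, which shifts the seeded sequence. The booby trap then shows how the resulting exception points straight at the caller.

```python
import random
import traceback


def define_helper():
    """Stand-in for importing a hypothetical third-party test helper."""
    # The default is evaluated once, when `def` runs, so it silently
    # consumes a value from the shared global generator.
    def helper(seed=random.random()):
        return seed
    return helper


def first_draw(import_helper):
    random.seed(1702)
    if import_helper:
        define_helper()
    return random.random()


clean = first_draw(import_helper=False)
disturbed = first_draw(import_helper=True)
print(clean != disturbed)   # True: same seed, different first value

# Booby-trap random.random and let the exception name the culprit.
original = random.random
random.random = lambda: 1 / 0
try:
    define_helper()
except ZeroDivisionError as exc:
    frames = traceback.extract_tb(exc.__traceback__)
    culprit = frames[-2].name   # frames[-1] is the trap lambda itself
finally:
    random.random = original    # never leave the trap armed

print(culprit)   # define_helper
```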

Big Lessons

  • Break conventions to get what you need
  • But only for debugging!
  • Dynamic analysis is your friend
  • Understand Python!

Questions?

  • Q: How do you fix the 3rd party problem until it gets fixed? A: Sometimes we have to fork. In this case we had another option.
  • Q: We’ve seen the answers… what about the thought process? A: No good answers, but think outside the box and understand that anything is possible and that you can play around with that malleability.

Pyjion: who doesn’t want faster for free?

Brett Cannon, Dino Viehland (Microsoft)

  • Python, JIT, … want to introduce a JIT API to CPython… we hope
  • This would give us faster execution while keeping compatibility with all of the C standard library stuff
  • We don’t want to burden CPython with a JIT itself; we want an API that starts a JIT space race for Python
  • We use the CoreCLR JIT. We’re not married to it, but we think it works reasonably well.
  • Whether our JIT wins, we don’t care. We want Python to win by having a JIT.
  • We want to have shown that a JIT works.

Why?

  • Because faster is always nicer
  • … especially when it’s already compatible with your stuff.
  • Dino thought, “how hard can this be?”
  • Started to get stuff working at PyCon last year
  • We have been doing this at work, in our spare work time
  • How does it compare to …
    • PyPy
      • Toolchain to generate a JIT
      • Not 100% compatible with all extension modules (e.g. CFFI)
    • Pyston
      • Coming out of Dropbox
      • Just hit 0.5 this week
      • Re-uses large portions of CPython to maintain compatibility
    • Numba
      • Numeric specific JIT
      • Continuum Analytics
      • Have to decorate your numeric specific code
      • Supports GPUs
    • Psyco and Unladen Swallow
      • Psyco was stopped and became PyPy
      • Swallow was shut down after a year of fighting with bugs in LLVM
  • We are the only one that hit our compatibility numbers
  • Only one that supports Python 3.5

How?

  • High-level overview
    • JIT at the code object level
    • Gives exposure to anything with a scope
    • Use MSIL
    • Use CoreCLR as an assembler, so, no, we’re not bringing .NET into Python
    • Use CPython’s C API to maintain compatibility
  • Changes to CPython’s C API
    • Gives Python a frame evaluation API
    • There’s a function PyEval_EvalFrameEx() that takes a frame, evaluates it or raises an exception
    • Opens the door to much better debugging
    • Adds a field, ->co_extra, to the code object, where we stuff all of the JIT data
      • i_run_count
      • i_failed (e.g. we cannot compile generator functions)
      • i_evalfunc
        • trampoline function that tracks what types are coming into this function
        • can be shared
      • i_evalstate
        • extra place to squirrel away extra data
      • i_specialization_threshold
        • want to put limits on how many versions we want to optimize in case you have a hugely polymorphic function
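
To make the bookkeeping above concrete, here is a toy Python model of it. This is only an illustration of the idea as described in these notes, not Pyjion's implementation: `JitState`, `trampoline` and `hot_after` are invented names, and "compiling" is faked by storing the original function.

```python
from collections import Counter


class JitState:
    """Hypothetical stand-in for the data hung off co_extra."""

    def __init__(self, specialization_threshold):
        self.run_count = 0
        self.seen_types = Counter()    # argument type signature -> call count
        self.specialized = {}          # argument type signature -> callable
        self.specialization_threshold = specialization_threshold


def trampoline(func, hot_after=3, specialization_threshold=2):
    """Wrap func so each call records the argument types it was given."""
    state = JitState(specialization_threshold)

    def wrapper(*args):
        state.run_count += 1
        signature = tuple(type(arg) for arg in args)
        state.seen_types[signature] += 1
        is_hot = state.seen_types[signature] >= hot_after
        has_room = len(state.specialized) < state.specialization_threshold
        if signature not in state.specialized and is_hot and has_room:
            # A real JIT would emit type-specialized machine code here.
            state.specialized[signature] = func
        return state.specialized.get(signature, func)(*args)

    wrapper.state = state
    return wrapper


add = trampoline(lambda a, b: a + b)
for args in [(1, 2)] * 3 + [(1.0, 2.0)] * 3 + [("a", "b")] * 3:
    add(*args)

print(add.state.run_count)          # 9
print(len(add.state.specialized))   # 2: the str version hit the cap
```

The cap is what keeps a hugely polymorphic function from producing an unbounded number of specialized versions.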

Bumps in the road

  • CPython has two separate stacks, one for execution and one for exception handling
  • CPython has very complicated end finallys
  • iteration opcodes leave something on the stack after every loop
  • error checking everywhere …

Performance

  • 41 benchmarks
    • 14 are slower
    • 12 are the same
    • 15 are faster
  • String manipulation performs pretty badly
  • Future optimization
    • function inlining

When?

  • The PEP for the changes to CPython is out for review
  • Python is compatible enough today
  • C++ framework for JITs is still just an idea
  • We’re numpy compatible; it “just worked”

https://github.com/Microsoft/Pyjion

Q/A

  • Q: Code objects are supposed to be immutable… sooooooo? A: We don’t let you mutate the object from the Python code, but this is the most controversial part of the proposal. But even if it impacts it now, we should be good to go in the future.
  • Q: CFFI has multiple backends… will you augment that? A: We don’t need to… you just compile it for CPython and it just works.

Code Unto Others

Nathaniel Manista, Augie Fackler

1700 lines of code, 17 public methods and 45 public attributes

  • Is this a problem?
    • Not for the compiler
    • Not for the interpreter
    • Not for the users
  • Software is Made of People
  • Readability: Your software needs to describe itself to the reader the way you would describe it.
  • You don’t scale
  • You don’t get to have a 1-on-1 conversation with everyone
  • You need your software to describe itself
  • Correct, Efficient, Readable, Self-actualization, Kill all humans
  • Lack of Cohesion
    • Kind of a grab bag of a bunch of different things
    • At least 3 different layers of things in one component
    • Data container that hangs extra things to the side
    • Mixes function and convenience
    • At least has the decency to say that it’s for convenience, but still adds to the cognitive burden of understanding it
    • Mixes layers of abstraction
    • Too many elements in its API
    • 7 +/- 2 is the working memory for humans
    • This has 92 for a client, more if you’re a dev
    • We can’t even talk about what this does without using subdivision. This does this and this.
    • Text is too long. I shouldn’t have to put a pot of coffee on to read code.
  • Example: def long_function()
  • A long function is a tax on working memory
  • localrepository wasn’t inflicted by a malevolent enemy
  • Requirements of any software change
    • must meet a need in the domain of the software
    • A change’s author must understand the problem sufficiently to create the change
    • Most of the time we guess about the lifetime of code, we guess short, and the code outlives our projections
  • Silver bullets
    • “Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables…” Mythical Man Month.
    • Mastering a particular problem domain can’t generically be made easier, but readability and maintainability emerge from code.
    • If it’s apparent where state is changed and how program control is passed from one component to another, most of the puzzle is solved.
    • Out params and mutable global state both obfuscate the program control flow.
  • Why do we perform experiments in labs?
    • Because we want to screen out background interference
  • Why do we manufacture goods in

  • “Do where the doing is Simplest”
  • “Code where the Coding is Simplest”
class bar(foo):
  ANSWER = 42
  • Placing code at class-scope invites questions
  • Avoid placing code elements at class-scope unless you have no alternative.
    • Don’t invite questions
    • The class “looks nice” with it there. NO!
    • I really want it there! NO!
    • It’s only ever used from the class. NO!
  • The class cannot function as required by its users without the code element at class scope.
  • Place all code elements at module-scope by default.
  • Classes: module-scope.
  • Functions: module-scope.
  • Constants: module-scope.
  • Be a class realist
    • provide a way to structure and aggregate data
    • purely abstract classes define types
    • classes implement types
    • create arbitrarily many instances that behave in ways that are mostly similar but different according to the values used at construction
  • Maintainable classes typically avoid mixing these. 3/4 are OK but not the others.
  • Classes are not for
    • BEING FUNCTIONS
    • Being concrete, but never being instantiated
      • Python has namespaces, they’re called modules
    • Enumerated polymorphism.
      • You want to create a type
      • But you have three or four ideas of the only types that will ever exist.
      • Use the ABC module. Don’t abuse base classes.
    • Mixins
      • (Eww, yuck!)
      • Your class cannot use them without becoming them

Design of Classes

  • Avoid self-use of public APIs
  • Avoid self-escape in class implementation
    • never pass the self reference out of your API
    • calls into question whether other parts of your system are at a lower or higher level of abstraction
    • pass something less than self
  • Minimize instance state
    • tracked down a huge memory leak when a long-lived instance stuffed things into self
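
A short sketch of "pass something less than self", using hypothetical names (`Order`, `format_receipt`). The helper receives only the two values it needs, so it cannot reach back into the instance, and the instance stores nothing beyond its constructor values.

```python
def format_receipt(customer_name, total):
    """Lower-level helper: takes two plain values, never the whole Order."""
    return "{} owes {:.2f}".format(customer_name, total)


class Order:
    def __init__(self, customer_name, prices):
        # Minimal instance state: just what was given at construction.
        self._customer_name = customer_name
        self._prices = tuple(prices)

    def receipt(self):
        # The tempting alternative, format_receipt(self), would let the
        # helper depend on every attribute of Order.
        total = sum(self._prices)
        return format_receipt(self._customer_name, total)


print(Order("Ada", [1.5, 2.25]).receipt())   # Ada owes 3.75
```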

Design of Modules

  • Always place imports at the top of your modules
    • You should always have a directed acyclic graph of modules
  • Default to putting all of your code elements at private visibility
    • only “promote” them to public when you make them a deliberate and intentional part of the API

Advanced Techniques

  • There are times in your life when your judgement is compromised
  • You’ve had a problem for a week and you’ve been designing, testing, coding; you have the problem on the brain and you’ve cracked it.
    • Are you the best person to decide what’s obvious/self-evident? No.
  • Line limits and complexity limits on functions, classes, and modules
    • “Once you understand everything that’s involved in horckleblaxing a fnast… it’s obvious why this function cannot be made any simpler”
  • Intermediate Concepts
    • Go ahead and define those intermediate concepts. If the code is really long, there are other things to be defined. Better too obvious, than not obvious enough.
    • They seem like they’re adding complexity, but it’s only when you’re in the moment when it seems obvious
  • Don’t abbreviate when naming
  • Dramatis Personae
    • We don’t like to be scared or surprised when we’re reading our source code
    • This is the point when we print out source code and
  • Sort your code elements in definition-before-use order
    • This is not something you do every day when you’re writing source code
    • private elements will float to the top
    • public elements will scroll to the bottom
    • we get pushback
    • maintainers and users are two different audiences of readers
    • maintainers have to be there
    • users don’t want to be reading your code
    • Instead, provide generated API docs so that users can read it there.

Code Unto Others

  • Readability is independent of correctness, efficiency and problem domain
  • Set your judgement aside



Published

30 May 2016

Category

work
